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ABSTRACT 

^  sLcufftrt'  rf'V.: r  o-r 

Described  here  is  the  structure  and 
theory  for  a  sequential  quadratic 
programming  algorithm  for  solving  large, 
sparse  nonlinear  optimization  problems. 
Also  provided  are  the  details  of  a 
computer  implementation  of  the 
algorithm,  along  with  test  results.  The 
algorithm  is  based  on  Han's  sequential 
quadratic  programming  method.  It 
maintains  a  sparse  approximation  to  the 
Cholesky  factor  of  the  Hessian  of  the 
Lagrangian  and  stores  all  gradients  in  a 
sparse  format.  The  solution  to  the 
quadratic  program  generated  at  each  step 
is  obtained  by  solving  the  dual 
quadratic  program  using  a  projected 
conjugate  gradient  algorithm.  Since 
only  active  constraints  are  considered 
in  forming  the  dual,  the  dual  problem 
will  normally  be  much  smaller  than  the 
primal  quadratic  program  and,  hence, 
much  easier  to  solve.  An  updating 
procedure  is  employed  that  does  not 
destroy  sparsity. 

Several  test  problems,  ranging  in 
size  from  5  to  60  variables  were  solved 
with  the  algorithm.  These  results 
indicate  that  the  algorithm  has  the 
potential  to  solve  large,  sparse 
nonlinear  programs.  The  algorithm  is 
especially  attractive  for  solving 
problems  having  nonlinear  constraints. 
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INTRODUCTION 


Constrained  nonlinear  optimization  has  been  the  subject  of  a 
significant  amount  of  research  during  the  past  two  decades.  As  a 
result,  a  variety  of  different  types  of  algorithms  for  solving  nonlinear 
programs  have  been  developed  and  tested.  (See  Lasdon  and  Waren  [1979] 
for  a  report  on  the  status  of  nonlinear  programming  software.)  Many  of 
these  algorithms,  some  based  on  the  sequential  quadratic  programming 
(SQP)  method  to  be  described  later,  have  been  found  quite  efficient  for 
solving  small  to  medium-sized  problems.  As  yet,  however,  there  have 
been  few  attempts  to  construct  algorithms  for  solving  large-scale 
nonlinear  programs . 

Those  algorithms  for  which  software  now  exists  are  not  readily 
adapted  to  large  problems  because  they  typically  do  not  take  advantage 
of  the  sparsity  of  the  Hessian  matrices  normally  associated  with  large- 
scale  systems.  This  is  a  serious  defect  since  the  storage  and  handling 
of  large,  dense  Hessian  matrices  is  prohibitively  expensive.  Those  few 
algorithms  that  have  been  specifically  designed  for  large-scale  problems 
are  normally  considered  most  efficient  for  special  types  of  problems, 
such  as  geometric  programs  or  those  with  linear  constraints  (for 
Instance,  see  ftirtagh  and  Saunders  [1982]). 


This  paper  presents  an  algorithm,  using  the  aforementioned  SQP 
method,  that  has  been  developed  to  handle  large  nonlinear  programs, 
including  those  with  nonlinear  constraints. 

This  algorithm  has  several  unique  features.  It  maintains  a  sparse 
approximation  to  the  Cholesky  factorization  of  the  Hessian  of  the 
Lagrangian  function.  The  quadratic  program  generated  at  each  iteration 
of  the  SQP  method  is  transformed  into  a  quadratic  program  having  only 
nonnegativity  constraints  corresponding  to  the  multipliers  associated 
with  inequality  constraints  in  the  original  nonlinear  program.  The 
transformed  quadratic  program,  which  is  always  feasible,  is  then  solved 
using  a  projected  preconditioned  conjugate  gradient  (CG)  method,  and 
finally,  the  algorithm  uses  a  quasi-Newton  update  scheme  for  the 
factorization  of  the  Hessian  approximation  that  ignores  fill-in. 

The  general  form  of  the  nonlinear  program  to  be  considered  here  is 

(1)  min  f(x) 
x  e  Rn 

subject  to  8j(x)  <0.  j  ■  1,  2,  . . . ,  m, 
h^(x)  ■  0,  k  ■  m  +  1 . . 

We  will  normally  assume  that  f,  gL,  g2 . 8m'  ''m  +  1 . \  +  p  are 

twice  continuously  differentiable. 


NOTATION 


We  follow  the  following  notatlonal  conventions.  The  gradient  of  a 
real-valued  function  f  of  the  vector  x  will  be  denoted  by  Vf(x),  the 
Hessian  of  f  by  V^f(x).  The  multiplier  vector  will  be  written  as 
(u',  v')\  where  u  is  the  multiplier  vector  corresponding  to  inequality 
constraints  and  v  is  the  multiplier  vector  for  equality  constraints. 

The  transpose  of  a  matrix  Q  will  be  denoted  by  Q’ .  Likewise,  the 
transpose  of  the  column  vector  u  will  be  the  row  vector  u’ .  Note  that 
no  special  notation  is  used  for  the  multipliers  corresponding  to  upper 
or  lower  bounds  on  variables. 

The  Lagrangian  function  will  be  denoted  by  i(x,  u,  v)  with  the 
Hessian  of  the  Lagrangian  denoted  by  7xxJL(x,  u,  v)  and  its  positive 
definite  representation  by  Q*LLf  where  L  is  a  lower  triangular  matrix. 

The  step  vector  will  be  s  and  the  step  length  parameter  will  be 
a.  The  current  estimate  of  a  solution  will  be  given  by  xc  and  a  new 
estimate  by  x11. 


For  a  scalar  a,  [a]+  *  max{0,  a},  |a|  *  max{-a,  a}.  For  a  vector 
z,  let  | z|  ■  max{| z^ j }  and  | z | 2  “  t  J  zj  fa  • 

Let  A  be  a  matrix.  The  i,  j-th  element  of  A  is  A^j,  the  j-th 
column  is  A  ^  and  the  i-th  row  is  Aj^.  The  unit  vector  having  all  zeros 
except  in  the  i-th  component  will  be  denoted  by  e. . 


1.1  THE  SEQUENTIAL  QUADRATIC  PROGRAMMING  METHOD 


The  sequential  quadratic  programming  method  generates  a  sequence  of 
quadratic  programs  (QP)  that  approximate  the  local  behavior  of  the 
original  nonlinear  program  (1).  The  Hessian  of  this  subproblem  is 
updated  from  iteration  to  iteration  using  one  of  the  variable  metric 
updating  formulas  (Han  [1976] )•  The  solution  of  the  QP  subproblem 
generated  at  each  iteration  determines  the  step  direction  for  that 
iteration,  and  the  multiplier  vector  associated  with  this  solution  to 
the  QP  is  taken  as  an  approximation  to  the  multiplier  vector  of  (1). 

The  following  is  a  brief  overview  of  the  SQP  method. 


Initially,  let  us  assume  that  (1)  has  only  inequality  constraints. 
The  generalization  to  include  equality  constraints  is  straightforward. 
Let  xc  e  Rn  be  the  current  estimate  of  the  solution  to  (1),  uc  e  Rm  the 
current  estimate  of  the  Lagranglan  multiplier  vector  associated  with  the 
solution  of  (1),  and  £(x,  u)  the  Lagranglan  function,  i.e., 


£(x,  u)  -  f(x)  +  u’g(x). 

Let  (x*,  u*)  be  a  Kuhn-Tucker  point  corresponding  to  a  local  minimum  of 
(1);  that  is,  (x*,  u*)  satisfies  the  following: 
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(2a)  7xX(x*,  u*)  -  0, 


(2b)  u*'g(x*)  -  0, 


(2c)  u*  >  0,  and 


(2d)  g(x*)  <  0. 


The  SQP  algorithm  determines  s£  e  Rn  and  sj  e  ^  such  that 


(xc  +  sx,  uc  +  s^)  is  a  first-order  approximate  solution  to  the  equation 


defined  by  (2a)  -  (2d)  .  It  can  be  shown  that  s£  is  the  solution  to  the 


following  quadratic  program: 


(3)  min  Vf(xc)’sx  +  (l/2)sx' 7^Jl(xc ,  uc)sx 


subject  to  g(xc)  +  Vg(xc)'sv  <  0 


and  uc  +  Sy  is  the  corresponding  multiplier  vector  (see  Boggs,  Tolle, 


and  Wang  [1982]) 


The  SQP  iteration  for  a  nonlinear  program  having  both  equality  and 


inequality  constraints  can  now  be  described  as  follows: 


[1]  Given  xc,  a  current  iterate,  and  Q  ,  a  current  approximation 


of  the  Hessian  of  the  Lagrangian,  determine  a  Ruhn-Tucker  point 


(s£,  Uq,  Vq)  of  the  quadratic  program 


a 


m 


min  Vf(xc)'s  +  (l/2)s  'Q  s 


subject  to  gj(xc)  +  7gj(xc)'sx  <0,  j  -  1 . m 

hjt(xc)  +  7hjt(xc),sx  ■  0,  k  -  m  +  1,  . m  +  p. 


{21  Set 


x“  ‘  xc  +  asx,  un  -  uj,  vn  -  vj 


where  a  Is  a  step-length  parameter  chosen  so  that  an  appropriate  penalty 


function  Is  decreased. 


[3]  Update  Qc  so  that  Qn  Is  an  approximation  of 


With  the  above  Iteration  as  a  basis,  the  SQF  method  would  be 
expected  to  have  many  of  the  properties  of  the  well-known  variable 
metric  methods  for  unconstrained  minimization,  since  in  the  absence  of 
constraints,  the  SQP  method  reduces  to  a  variable  metric  algorithm.  For 
the  constrained  case,  however,  the  Hessian  (of  the  Lagrangian)  need  not 
be  positive  definite,  thus  the  standard  positive  definite  updates  such 
as  the  BFGS  or  DFP  (see  Fletcher  [1980]  and  Dennis  and  Schnabel  [1983]) 
may  not  be  appropriate.  To  date,  local  superlinear  convergence  has  not 
been  established  for  the  use  of  positive  definite  updates  except  in  the 


case  where  the  Hessian  of  Che  Lagrangian  at  the  solution  Is  positive 
definite.  (See  Han  [1976]  and  Boggs,  Tolle,  and  Wang  [1982].) 

There  are  nonpositive  definite  choices  for  the  matrices  Qc  that 
will  lead  to  local  superlinear  convergence,  as  shown  by  Wilson  [1963] 
(who  uses  the  Hessian  of  the  Lagrangian  itself)  and  Han  [1976]  .  But  in 
these  cases  the  solution  of  the  quadratic  program  in  step  [1]  is  not 
straightforward;  in  fact,  there  may  be  multiple  solutions.  Nocedal  and 
Overton  [1983]  have  recently  developed  an  updating  scheme  for  equality 
constrained  problems  that  may  help  to  resolve  this  difficulty,  but  its 
application  to  general  problems  is  not  yet  ensured. 

In  light  of  the  absence  of  a  feasible  alternative,  most  optimizers 
implementing  an  SQP-type  algorithm  have  opted  for  using  positive 
definite  updates  of  the  BFGS  or  DFP  type.  The  experimental  results  have 
been  quite  good  (see  Hock  and  Schittkowski  [1981])  despite  the  lack  of  a 
solid  theoretical  underpinning.  The  approach  of  this  paper  is  based  on 
the  same  practical  considerations,  and  hence  a  positive  definite 
updating  scheme  will  be  used. 

A  vital  part  of  an  SQP  algorithm,  as  for  any  algorithm  for  solving 
(1),  is  a  provision  for  forcing  convergence  from  a  remote  starting 
point.  In  the  unconstrained  case,  this  is  accomplished  by  requiring  a 
decrease  in  the  objective  function  at  each  iterate.  For  constrained 
optimization,  reduction  of  the  objective  function  must  be  balanced 


against  satisfying  the  constraints.  This  balancing  act  is  usually 
achieved  by  requiring  a  reduction  in  a  "merit'*  or  "penalty"  function  at 


each  Iteration.  Han  [1977]  proposed  that 


(4)  p_(x)  -  f(x)  +  r  {  T.  [g.(x)]  +  Z  |h 

j-1  J  k-nrt-1 


be  used  as  a  merit  function,  where  r  is  larger  than  the  absolute  value 
of  any  of  the  multipliers  associated  with  the  solution  to  (1).  Under 
reasonable  conditions,  he  shows  that  any  sequence  {(x^,  u^,  v^)} 
generated  using  the  SQP  algorithm  with  a  positive  definite  update  and 
this  merit  function  will  converge  to  a  Kuhn-Tucker  point.  The  algorithm 
developed  for  this  paper  employs  Han's  merit  function,  thus  guaranteeing 
global  convergence  under  the  conditions  imposed  by  Han. 


In  spite  of  ensuring  global  convergence,  this  merit  function  is  not 
entirely  satisfactory.  In  particular,  it  does  not  guarantee  that  full 

lr 

steps,  i.e.,  a  m  1,  will  be  taken  as  x  approaches  x*.  This  may 
restrict  the  convergence  rate  to  be  less  than  superlinear.  Chamberlain, 
Lemarechal,  Pedersen,  and  Powell  [1979],  Boggs  and  Tolle  [1980,  1981], 
and  Bertsekas  [1980,  1981]  have  developed  other  merit  functions  that  do 
allow  for  a  -  1  near  a  solution.  However,  these  merit  functions  are 
computationally  more  complex  than  (4)  and  were  not  considered  suitable 
for  a  large-scale  algorithm. 


V.VWiT 


"Co" 


1.2  STRUCTURE  OF  A  SPARSE  SQP  ALGORITHM 


The  algorithm  presented  here  is  designed  to  solve  large-scale 
nonlinear  programming  problems  having  sparse  Lagrangian  Hessians. 

Because  it  uses  the  SQP  method,  the  algorithm  is  particularly  well 
suited  for  solving  problems  having  nonlinear  constraints.  The  algorithm 
allows  for  any  sparsity  pattern  in  the  Hessian  of  the  Lagrangian,  i.e., 
no  particular  sparsity  pattern  is  assumed.  Linear  constraints  and 
upper-  and  lower-bound  constraints  on  the  variables  are  handled 
explicitly  by  the  algorithm.  The  number  of  constraints  is  theoretically 
unlimited;  however,  an  active  set  strategy  is  used  and  the  number  of 
constraints  active  at  any  given  time  is  constrained  by  the  amount  of 
memory  available.  Also,  the  algorithm  is  more  efficient  when  the  number 
of  active  constraints  is  small  relative  to  the  number  of  variables. 

The  following  discussion  describes  the  structure  of  the  algorithm. 
The  discussion  focuses  on  the  features  of  the  algorithm  that  permit  the 
SQP  method  to  be  applied  to  larger  problems. 


1.2.1  SPARSITY 


The  algorithm  described  here  Is  designed  to  exploit  the  sparsity 
often  found  In  large-scale  problems.  Such  problems  usually  have  sparse 
Lagrangian  Hessians  and  sparse  gradients.  Handling  sparse  gradients  Is 
not  difficult  and  most  algorithms  can  be  easily  adapted  to  do  so. 
However,  the  handling  of  a  sparse  Lagrangian  Hessian  Is  not  so  easy 
because.  In  general,  matrix  operations  do  not  preserve  sparsity.  The 
Hessian  of  the  Lagrangian,  or  an  approximation  of  It,  appears  In  most 
nonlinear  optimization  algorithms  and  will  normally  be  used  as  the 
coefficient  matrix  of  a  system  of  equations  that  Is  used  to  compute  a 
step  direction  as  in  the  SQP  algorithm.  Exploiting  the  sparsity  of  the 
Hessian  of  the  Lagrangian  is  Important  for  two  reasons.  First,  in  large 
problems  using  the  sparse  structure  often  reduces  the  total  amount  of 
computation  required.  Second,  storing  a  sparse  matrix  requires  much 
less  memory  than  storing  a  full  matrix.  Even  if  a  computer  has 
unlimited  capacity  for  storing  the  matrix,  such  as  is  the  case  for 
virtual  memory  machines,  manipulation  of  the  full  matrix  may  cause 
considerable  paging  of  memory  as  different  parts  of  the  matrix  are 
accessed.  The  resulting  I/O  time  for  swapping  the  different  parts  of 
the  matrix  between  core  and  a  high-speed  storage  device  can  be 
expensive. 


We  have  chosen  to  store  the  representation  of  the  Hessian  of  the 
Lagrangian  as  a  lower  triangular  matrix  under  the  assumption  that  the 


approximation  to  the  Hessian  of  the  Lagrangian  will  be  maintained  as  a 
positive  definite  matrix.  Unfortunately,  the  Cholesky  factor  of  a 
sparse  symmetric  and  positive  definite  matrix  need  not  be  sparse. 

George  and  Liu  [1981] ,  however,  describe  an  algorithm  for  permuting  the 
rows  and  columns  of  a  sparse,  symmetric  positive  definite  matrix  that 
significantly  reduces  the  fill-in  that  occurs  in  the  lower  triangular 
Cholesky  factor.  The  authors  also  describe  a  storage  scheme  for  the 
sparse  matrix  and  code  for  solving  systems  of  equations  defined  by  the 
original  matrix  by  performing  forward  and  backward  substitution  on  the 
triangular  factor  to  obtain  the  solution. 

Representing  the  Hessian  approximation  as  the  lower  triangular 
factor  has  other  computational  advantages.  The  next  section  discusses 
the  solution  of  the  dual  to  the  quadratic  program  defined  by  each 
iteration  of  the  SQP  algorithm.  Generation  of  this  dual  problem  is  much 
simpler  if  the  lower  triangular  factor  is  available.  (See  section 
II. 3. 4.) 


1.2.2  SOLVING  THE  QUADRATIC  PROGRAM 


The  quadratic  program  (QP)  generated  at  each  iteration  of  the  SQP 


method  has  the  form 


(5)  min  (l/2)s'Qs  +  q's 
s 

subject  to  As  <  a 

Bs  ■  b  . 

For  a  large-scale,  sparse  problem  Q  will  be  a  large,  sparse  matrix. 
Standard  methods  for  solving  quadratic  programs,  such  as  the  pivoting 
methods  of  large-scale  linear  programming  methods,  either  do  not  take 
advantage  of  the  sparsity  structure  or  are  too  complicated  for  repeated 
use  in  a  nonlinear  programming  code.  The  quadratic  program  generated  by 
the  SQP  method  may  be  infeasible.  By  solving  the  Wolfe  dual  (Wolfe 
[1961])  to  (5)  these  problems  can  be  avoided. 

If  (5)  Is  feasible  and  Q  is  positive  definite,  then  the  Wolfe  dual 
of  (5)  will  be  feasible  and  will  have  a  nonempty  solution  set.  The 
structure  of  this  algorithm  maintains  a  positive  definite  representation 
of  Q.  If  the  set  {s  :  As  <  a,  Bs  *  b  }  is  empty,  the  solution  obtained 
from  solving  the  Wolfe  dual  of  (5)  will  be  a  "least  infeasible”  solution 
of  (5).  (See  appendix  B.)  We  solve  a  transformed  version  of  the  dual 
problem  rather  than  the  primal  problem  because  it  is  a  quadratic  program 


having  only  nonnegativity  constraints  on  variables  that  are  the 


multipliers  corresponding  to  the  inequality  constraints  in  (5).  A 
projected  preconditioned  conjugate  gradient  algorithm  for  solving  the 
transformed  Wolfe  dual  problem  is  given  in  section  1.3. 


There  are  several  advantages  to  solving  the  dual  QP.  If  the  number 
of  active  constraints  is  small  relative  to  the  number  of  variables  in 
the  nonlinear  program,  then  the  dual  problem  will  be  much  smaller  than 
problem  (5).  Moreover,  since  the  solution  to  (5)  is  the  step  direction 
used  by  the  SQP  method,  one  would  expect  the  step  directions  to  change 
significantly  from  iteration  to  iteration,  even  when  close  to  a  solution 
of  the  nonlinear  program.  However,  the  solution  to  the  dual  problem  can 
be  shown  to  be  a  good  approximation  to  the  multiplier  vector  of  the 
nonlinear  program  and  should  not  change  much  from  iteration  to 
iteration.  Consequently,  a  great  efficiency  is  gained  by  using  the 
estimated  multiplier  vector  from  the  preceding  iteration  as  the  initial 
estimate  in  solving  the  dual. 

Choi,  Haug,  Hou,  and  Sohoni  [1983]  report  on  the  use  of  an 
algorithm  developed  by  the  Russian  Pshenichny  [1970]  to  solve  optimal 
design  problems.  Pshenichny's  algorithm  is  similar  to  the  one  described 
here  i.i  that  he  solves  a  sequence  of  quadratic  programs  by  solving  their 
duals.  His  sequence  of  quadratic  programs  is  similar  to  Han's  except 
that  the  matrix  defining  the  quadratic  program  is  always  the  identity, 
i.e.,  no  second  order  information  is  used.  Thus  the  algorithm  is 
significantly  different  from  that  proposed  herein. 


1.2.3  STEP-LENGTH  CONTROL 


Having  determined  a  step  direction  by  obtaining  the  solution  to 
(5),  it  is  necessary  to  take  a  reasonable  step  in  that  direction.  Since 
large  problems  are  being  solved,  the  method  of  controlling  the  step 
length  must  be  relatively  simple.  We  have  chosen  to  use  the  penalty  or 
merit  function  of  Han  [1977] 

(6)  $(a)  *  f(x  +  as) 

m  nrf-p 

+  r  {  £  [g  (x  +  as)]  +  Z  j\.(x  +  as)  I  }  , 
j-1  2  k-nrt-1 

m 

where  x  is  the  current  estimate  of  the  solution  to  the  NLP  and  s  is  the 
step  direction.  The  step-length  parameter  is  a.  The  scalar,  r,  is 
chosen  to  be  larger  than  the  largest  multiplier  in  absolute  value.  A 
step,  as,  will  be  taken  if  an  a  can  be  found  that  produces  an  acceptable 
decrease  from  <J>(0)  to  <J>(a).  The  details  can  be  found  in  section  1.4. 

1.2.4  UPDATING  PROCEDURES 

The  standard  SQP  method  maintains  an  approximation  to  the  Hessian 
of  the  Lagrangian  which  is  updated  after  each  iteration  using  one  of  the 
well-known  matrix  updating  methods.  One  of  the  more  common  methods  is 
the  BFGS  updating  method  (Dennis  and  More  [1977]).  The  algorithm  given 
here,  however,  maintains  a  sparse  representation  of  the  approximation  to 


Che  Hessian  which  Che  sCandard  BFGS  updating  scheme  does  noc  do. 
Therefore,  we  have  chosen  a  meChod  of  Goldfarb  [1976]  for  updacing  Che 
Cholesky  facCor  of  a  posicive  definice  maCrix.  Fill-in  is  ignored  in 
applying  Che  meChod.  If  fill-in  were  allowed,  Che  mechod  would  produce 
Che  scandard  BFGS  updace  for  Che  Cholesky  facCor.  DeCails  can  be  found 
in  secCion  1.5. 

1.2.5  A  BASIC  ITERATION  OF  THE  ALGORITHM 

The  following  is  a  descripcion  of  a  basic  iCeraCion  of  Che 
algorichm  developed  here.  The  algorichm  uses  an  acCive  sec  scracegy. 
EquallCy  conscrainCs  are  always  acCive  and  inequaliCy  consCrainCs  are 
acCive  ac  Che  currenC  icerace  if  chey  are  infeasible  or  nearly  so  ac 
chaC  poinc.  Upper-  and  lower-bound  conscrainCs  are  creaced  as  general 
inequaliCy  conscrainCs  in  Che  descripcion  of  a  basic  iceracion.  The 
accual  implemenCaCion,  as  described  in  Part  II,  explicitly  handles 
upper-  and  lower-bound  constraints. 

[1]  Solve  the  transformed  QP  for  the  multiplier  vector.  Let 
(xc,  uc,  vc)  and  Lc  be  the  current  estimate  of  the  Kuhn-Tucker 
vector  and  the  approximate  lower  triangular  factor  of  the  Lagrangian 
Hessian,  respectively.  The  QP  generated  by  the  standard  SQP  method  is 


(7)  min  Vf(xc)’s  +  (l/2)s 'LCLC' s 


s  e  Rn 


subject  to:  Vg(xc)’s  +  g(xc)  <  0 

7h(xc) ' s  +  h(xc)  -  0 


Let  g  be  the  vector  of  active  inequality  constraints  —  including  active 
upper-  and  lower-bound  constraints.  The  solution  to  (7)  and  the 
associated  multiplier  vectors  are  obtained  by  transforming  (7)  into 


(8)  min  (1/2) (u' ,  v')  K  (u' ,  v')'  +  (u' ,  v')  k 
u  e  Rm' 


subject  to  u  >  0 


where 


K  -  (Vg(xc),  VMx^Vl:1  L:1(7I(xc),  7h(xc) ) 


k  -  (Vg(xc),  7h(xc))’L^1'L-1Vf(xc)  -  (g(xc) ' ,  h(xc)’)’, 


and  m'  is  the  number  of  active  inequality  constraints,  including  active 
upper-  and  lower-bound  constraints.  (The  number  of  active  inequality 
constraints,  m’ ,  may  change  from  iteration  to  iteration.)  Let  p'  ■  m'  + 
p.  Then  K  is  the  p'  x  p'  matrix  defined  by  K  *  M' M,  where 


M  -  L~1(7l(xc) ,  7h(xc) ) 


The  matrix  M  can  he  computed  by  p*  forward  substitutions  using  the  lower 
triangular  matrix  Lc*  Once  M  is  formed,  another  forward  substitution 
produces  q  *  L~^7f(xc)  so  that  k  ■  M'q  -  (g(xc)',  h(xc)')'. 

A  projected  preconditioned  conjugate  gradient  algorithm  is  employed  to 
solve  (8). 

Let  (u11',  vn')’  be  the  solution  to  (8). 

[2]  Solve  for  the  step  direction.  Let  sc  be  the  solution  to 

LcL£s  »  -  (7f (xc)  +  7g(xc)un  +  Vh(xc)vn} . 

(If  inequality  constraint  gj  is  considered  inactive,  then  uj  is  set  to 
0.)  If  QP  (7)  is  feasible,  then  sc  is  its  solution.  If  QP  (7)  is  not 
feasible,  then  an  sc  is  obtained  which  is  a  point  of  minimal 
infeasibility.  Note  that  only  one  forward  substitution  and  one  backward 
substitution  are  required  to  solve  for  s. 

[3]  Compute  the  step-length.  Let  r  >  max  (|u?|,  |v?| } 


<Ka)  *  f(xc  +  asc)  +  r{  £  [g^x0  +  asc)]+ 

i-1 

p 

+  £  | h.(xc  +  asc) I } . 

j-l  J 

Choose  ac  such  that  0  <  ac  <  1.0  and  ac)  is  sufficiently  smaller  than 
<t>(0).  Set  xn  *  xc  +  acsc. 

[4]  Check  for  termination.  Compute  the  gradient  of  the  Lagrangian 
function  at  (x11,  un,  vn)  ,  VxX(xn,  un,  vn) ,  and  terminate  successfully  if 
|vxJl(xn,  u11,  v11)  |  is  small  relative  to  the  objective  function  value. 

[5]  Update  the  triangular  factorization.  Update  Lc  but  maintain 
the  sparsity  structure.  Use  the  BFGS  updating  procedure  for  the 
Cholesky  factor  of  a  positive  definite  Hessian  developed  by  Goldfarb 
[1976]  with  a  modification  that  preserves  the  sparsity  pattern  in  Lc. 

[6]  Go  to  [1]  for  the  next  iteration. 

The  following  sections  of  Part  I  discuss  these  steps  in  more 


detail.  Part  II  describes  the  implementation  of  the  algorithm  and 
provides  many  of  the  details  not  given  here,  such  as  what  to  do  if  no 
acceptable  step  is  possible  in  the  direction  sc. 


1.3  THE  QUADRATIC  PROBLEM 


Transforming  the  general  quadratic  program  defined  In  (7)  Into  the 
quadratic  program  having  only  nonnegativity  constraints  on  some  of  the 
variables  (8)  has  two  advantages.  First,  if  the  number  of  active 
constraints  in  the  original  problem  is  small  relative  to  the  number  of 
variables,  then  the  transformed  QP  will  be  much  smaller  than  the 
original  and  will  have  simple  constraints.  Second,  solving  the 
transformed  problem  with  a  conjugate  gradient  method  has  proved  to  be 
very  efficient  (see  Part  II) .  This  is  especially  true  when  near  a 
solution  as  the  Initial  estimate  of  the  solution  to  the  transformed 
problem  will  be  close  to  the  multipliers  for  (1)  and  will  not  change 
much  from  iteration  to  iteration.  The  conjugate  gradient  algorithm  will 
therefore  have  to  do  very  little  work  to  refine  the  estimates  on  each 
iteration.  In  constrast,  the  solution  to  the  general  quadratic  program 
(7),  being  the  step  direction,  will  change  significantly  from  iteration 
to  iteration.  Hence,  using  the  step  direction  from  the  preceding 
iteration  as  the  initial  estimate  of  the  solution  will  not  improve 
computational  efficiency.  If  the  original  QP  (7)  is  infeasible,  the 
transformed  problem  is  still  feasible  but  unbounded,  though  it  can  be 
made  strictly  convex  by  a  simple  adjustment.  Infeasibility  of  the 


original  QP  should  occur  only  when  far  from  a  solution  to  (1),  and  it 
will  be  shown  that  the  step  direction  computed  using  the  approximate 
multipliers  obtained  from  the  adjusted  version  of  (8)  will  allow  the 
algorithm  to  continue  making  progress. 

Problem  (7)  could  be  solved  using  one  of  the  pivoting  algorithms 
(see  Dantzig  [1963]  and  Beale  [1967]).  These  algorithms,  however,  are 
not  particularly  useful  for  solving  large,  sparse  problems  because  they 
destroy  sparsity.  Thus,  they  are  not  considered  useful  for  solving  (7) 
in  the  context  addressed  here.  They  were  also  not  considered  for 
solving  the  transformed  problem  (8)  even  though  these  problems  should  be 
smaller  than  (7)  and,  possibly,  denser.  The  reason  is  that  the  pivoting 
methods  cannot  be  used  to  refine  an  estimate  of  a  solution  that  is 
already  close  to  the  desired  result.  Pivoting  methods  do  not  start  with 
an  estimate  of  the  solution  so,  unlike  the  CG  methods,  they  do  not 
exhibit  a  decrease  in  computation  when  a  good  estimate  of  the  solution 
is  already  available. 

1.3.1  THE  WOLFE  DUAL 

Suppose  Q  is  a  positive  definite  n  x  n  matrix,  A  and  B  are, 
respectively,  m  x  n  and  p  x  n  matrices,  and  q,  a,  and  b  are  fixed 
vectors  of  appropriate  dimension.  The  general,  strictly  convex 
quadratic  problem  has  the  form 


(9)  min 


(l/2)s'Qs  +  q's 


V  ■. 


• rv.' 


r<Vv-’V' 


KM 

& 


(11)  min  (l/2)w'Kw  +  k'w 

w  e  R^P 


subject  to  u  >  0, 


where  K  -  VQ_1V' ,  V  ■  (A',  B')',  and  k  ■  VQ-1q  +  v.  Here 


(a',  b' )'.  Note  that  K  will  be  positive  definite  if  and  only  if  V 


has  full  row  rank,  else  it  will  be  positive  semi-definite.  In  the 


former  case  the  following  theorem  is  well  known. 


Theorem  1:  Suppose  V  ■  (A1,  B')'  has  full  row  rank,  then  both  (9)  and 


(11)  have  unique  solutions,  say  s*  and  w*  *  (u*’,  v*')',  w*  is  the 


multiplier  vector  for  (9),  and 


(12)  s*  -  -Q~1[A'u*  +  B'v*  +  q]. 


Sometimes,  however,  there  are  enough  inequality  constraints  so  that 


m  +  p  >  n.  Then  V  cannot  have  full  row  rank  and  the  above  result  does 


not  apply.  However,  if  (9)  is  feasible,  we  have  the  following  result. 


(See  Wolfe  [1961]  .) 


Theorem  2;  Suppose  (9)  is  feasible.  Then  problem  (9)  has  a  unique 


solution  s*  and  problem  (11)  has  a  nonempty  solution  set  W*.  Moreover, 


for  any  w*  ■  (u*'f  v*')'  e  W*,  equation  (12)  holds.  If  (9)  is 


& 


infeasible,  then  problem  (11)  is  unbounded  and  has  no  solution. 

In  the  application  of  problem  (9)  in  the  algorithm  given  here,  it 
is  possible  that  the  quadratic  problem  may  be  infeasible.  In  this  case, 
the  following  result  will  be  applicable. 

Consider  the  perturbed  version  of  (11): 

(13)  min  (l/2)w'(K  +  el)w  +  k'w 

w  e  R^P 

subject  to  u  >  0 

where  e  is  a  small  positive  number.  Since  K  +  el  is  a  positive  definite 
matrix,  problem  (13)  has  a  unique  solution  we  *  (ue,  ve).  We  denote  by 

(14)  sE  -  -Q“^[A'ue  +  B've  +  q] . 

For  a  given  s  vector  we  measure  its  infeasibility  in  the  original 
quadratic  problem  (9)  by 

e(s)  -  { | [ As  -  a]+| 2  +  |Bs  -  bj^}1/2. 

Then  e(s)  *  0  if  and  only  if  s  is  feasible  for  (9).  The  set  of  least 
infeasible  points  is  denoted  by 


Clearly,  Z  is  a  convex,  closed,  nonempty  subset  of  R  .  If  (9)  is 
feasible,  it  is  exactly  the  feasible  set. 

Theorem  3:  Let  {we}  be  the  family  of  solutions  to  (13)  for  positive 
values  of  s,  and  for  each  e  let  se  be  given  by  (14).  Then 

lim  se  ■  s 

e  -*>  0+ 

A 

where  s  is  a  solution  of  the  problem 

(15)  min  (l/2)s'Qs  +  q's. 

s  e  Z 

Pf :  See  Appendix  B. 

In  the  algorithm  presented  for  solving  problem  (1),  Q  is  the 
updated  approximation  of  the  Hessian  matrix,  which  is  positive 
definite.  A  is  taken  to  be  the  gradients  of  the  active  inequality 
constraints  and  B  the  gradients  of  the  equality  constraints  at  the 
current  iterate  xc.  By  active  constraint,  we  include  all  of  the 
equality  constraints  and  any  inequality  constraints  for  which 
gj(xc)  >  — n,  where  p  >  0  is  a  prescribed  tolerance.  Assuming 
feasibility  and  nondegeneracy  for  the  original  problem,  the  quadratic 
programs  to  be  solved  will  likely  have  less  than  full  row  rank  only  when 


the  approximation  Is  far  from  feasibility.  In  this  case,  the  perturbed 
dual  problems  will  be  solved  with  small  e.  Theorem  3  provides 
justification  for  this  procedure  in  that  it  ensures  that  the  solution  s£ 
will  be  a  step  toward  minimum  infeasibility. 


1.3.2  THE  CONJUGATE  GRADIENT  ALGORITHM 


Before  considering  the  application  of  the  conjugate  gradient  method 
to  the  minimization  of  a  quadratic  function  subject  to  nonnegativity 
constraints,  we  should  review  some  of  its  properties  when  applied  to  the 
unconstrained  minimization  of  a  quadratic  function.  The  conjugate 
gradient  method  for  solving 

(16)  min  F(w)  ■  (l/2)w'Kw  +  k’w 
w  e  Rn 

is  as  follows: 

[0]  Starting  at  any  w°  e  Rn  set  X  -  0,  and  define 
p°  -  -7F(w°)  -  -Kw°  -  k. 

[1]  Set  ■  w*  +  ctjjP*, 

VF(wV  p* 


where  a. 


[2]  If  VFCw*’’1)  *  0,  set  w*  ■  wx,"t'i  and  terminate  with  w*  as  the 


solution  to  (16).  Otherwise,  go  to  [3] 


[3]  Set  p*+1  -  -VFCw*1"1)  +  p 


JW-1> 


,  „  TFCw^1)’  K  p* 

where  8*  " - P  * 

p  '  K  p 


[4]  Set  l  -  A  +  1.  Go  to  [1] 


In  exact  arithmetic,  the  algorithm  terminates  in  at  most  n  iterations 
for  positive  definite  K.  The  conjugate  gradient  algorithm  converges 
monotonically  to  w*  in  that  if  we  define 


E(w)  *  (w  -  w*) '  K  (w  -  w*) 


then  it  is  easy  to  show  that 


E(w*+*)  »  E(w*)  -  — E-J —  >  E(w^). 

P  *  K  P 

Thus,  in  the  metric  defined  by  the  positive  definite  matrix  K,  the 
conjugate  gradient  estimates  get  closer  to  the  solution  on  each 
iteration.  Likewise,  it  is  easy  to  show  that 

-  F(»«)  -  -  I  . 


showing  that  the  objective  function  is  decreased  on  each  iteration.  It 
should  be  noted  that  solving  (16)  for  positive  definite  K  is  equivalent 
to  solving  Kw  +  k  ■  0  for  w. 

If  we  let  r*  *  Kw*  +  q,  we  can  think  of  the  CG  algorithm  as  either 
trying  to  find  a  zero  of  the  gradient  or  trying  to  make  the  residual 
associated  with  the  linear  equation,  r,  equal  to  zero.  Unlike  most 
methods  for  solving  linear  systems  of  equations,  the  CG  method  does  not 
alter  the  matrix  K  and  involves  only  matrix-vector  multiplications. 

The  finite  termination  property  and  the  monotone  decrease  in  the 
distance  between  w*  and  w*  as  defined  by  the  matrix  norm  are  also 
achieved  by  a  modification  to  the  conjugate  gradient  algorithm  (see 
Polyak  [1969])  which  minimizes  a  quadratic  function  subject  to 
nonnegativity  constraints  on  the  variables. 

Let  y  -  Kw  +  k.  Then  w*  solves  (16)  if  and  only  if 

(17)  y*  >  0  if  w*  *0  and 

(18)  y*  -  0  if  w*  >  0. 

These  are  the  Kuhn-Tucker  conditions  for  the  solution  to  (16).  Another 
way  of  stating  these  conditions  is  to  say  that  w*  solves  (16)  if  and 
only  if 


(19)  w*'y*  ■  0  (complementarity  condition)  and 

(20)  w*,  y*  >0  (nonnegativity  condition). 

Polyak's  algorithm,  which  is  the  basis  of  the  algorithm  developed  by 
O'Leary  [1981],  maintains  nonnegativity  of  the  vector  iterates  w*  while 
iterating  toward  satisfying  the  remaining  conditions  in  (19)  and  (20). 
Polyak's  algorithm  terminates  in  a  finite  number  of  iterations  (O'Leary 
[1981]). 

1.3.3  A  PROJECTED  PRECONDITIONED  CONJUGATE  GRADIENT  (PPCG)  ALGORITHM 

O'Leary  [1981]  describes  a  modification  to  Polyak's  algorithm  that 
preconditions  the  CG  step  to  improve  the  convergence  rate  of  the 
algorithm.  She  also  proves  that  her  algorithm  converges  in  a  finite 
number  of  iterations.  O'Leary's  algorithm  has  been  modified  for  the 
work  described  in  this  application.  The  standard  CG  step  is  projected 
onto  the  feasible  region  as  opposed  to  O'Leary's  method  of  truncating 
the  step  at  the  boundary  of  the  feasible  region.  Taking  projected  steps 
has  the  advantage  of  allowing  more  than  a  single  variable  to  become 
inactive  on  an  iteration,  while  truncation  will  permit  only  one  variable 
to  become  inactive  on  a  single  Iteration. 


A  preconditioned  conjugate  gradient  algorithm  for  (11)  uses 
M”^7F(w*)  rather  than  VF(w*)  to  define  in  step  [3]  of  the  CG 

algorithm,  where  M-^  is  some  approximation  to  K”^.  One  obvious  choice 
for  M'1  is  the  inverse  of  the  diagonal  of  K.  Another  choice,  and  the 
one  actually  employed  here,  is  one  pass  of  the  symmetric  successive 
over-relaxation  (SSOR)  method  as  applied  to  the  system  of  equations 
defined  by  Kw  +  k  ■  0.  The  following  is  a  description  of  the  PPCG 
algorithm. 

In  our  description  of  the  PPCG  algorithm,  N  will  be  the  set  of 
indices  of  components  of  w  that  are  constrained  to  be  nonnegative  and  N 
will  be  the  set  of  remaining  indices.  A  vector  w  will  be  a  feasible 
solution  to  the  quadratic  program  if  w^  >  0  for  i  e  N.  At  any  given 
time,  a  variable  will  be  considered  active  or  inactive.  Only 
constrained  variables  can  be  inactive.  An  inactive  variable  will  always 
be  at  its  bound,  i.e.,  equal  to  zero.  A  variable,  however,  may  be  at 
its  bound  but  not  considered  inactive. 

The  PPCG  Algorithm 


[0]  (Initialization)  Choose  w°  such  that  w°  >  0  for  i  e  N  and 


[1]  (Outer  iteration)  Set  A  ■  A  +  1,  y* 


■  Kw*  +  k,  and  I^_^  *  I. 
Define  I^  *  (i  e  N:  w*  ■  0  and  y^  >  0}.  If  1^  *  *ji-1  anc*  l^il  ^  e  f°r 
i  i  Ij£,  then  terminate;  otherwise,  set  I  *  I^.  (Note:  e  is  the 
tolerance  on  the  norm  of  the  residuals  for  termination.) 


[2]  (Inner  iteration)  The  inner  iteration  only  manipulates  active 
variables.  During  the  inner  iteration,  variables  that  are  active  may 
become  inactive,  but  no  inactive  variables  become  active.  Inactive 
variables  can  become  active  only  during  execution  of  step  [1].  Let  J  be 
the  set  of  indices  of  active  variables,  then  any  variable  index  belongs 
either  to  I  or  J.  The  matrix  system  is  partitioned  as  follows: 


w*  + 


w* 

k 

K_, 

K' 

I 

o 

,  k  ->• 

I 

,  K  - 

II 

JI 

X 

k 

K„ 

k 

J 

J 

JI 

JJ 

Initialize  to  solve  the  equation 


KJJ  WJ  “  "kJ  "  KJI  WI* 

Set  z°  m  Wj  and  r°  =*  -kj  -  Kjj  w£  -  Kjj  z°.  Go  to  step  [4]. 

[3]  (Restart  inner  iteration)  If  a  projected  step  was  taken,  the 
variables  that  have  been  set  to  their  bounds  must  be  checked  to  see 


whether  they  should  become  inactive  variables.  Also,  the  residual 


vector  must  be  recomputed  since  the  CG  formula  for  updating  the  residual 

is  not  valid  for  a  projected  step.  Set  z°  ■  wj1  and 

r°  -  -kj  -  Kjj  wj  -  Kjj  z°.  For  each  i  e  J  Pi  N  do  the  following: 

if  z°  ■  0  and  r°  <  0,  then  add  i  to  I.  Repartition  as  necessary.  If 

there  are  no  active  variables,  then  restart  the  outer  iteration  (go  to 


[4]  (Calculate  new  iterate  and  residual)  Set 


o'  o 
r  r 

o '  o 

r  kjj  r 


Set  2T  ■  z°  +  aCG  r°.  For  each  l  e  JON,  do  the  following:  set 
zi  "  fzil+  and  set  a  projection  flag  if  this  is  a  projected  step. 

(Note:  O'Leary's  algorithm  allows  only  one  variable  per  iteration  to 
become  inactive,  whereas  this  one  may  set  more  than  one  to  the  inactive 
state  on  a  single  iteration.) 


If  this  is  a  projected  step,  set  w^  ■  z^  and  go  to  [3] .  Otherwise, 
set  r1  -  r°  -  aCQ  Kjj  r°. 


[5]  If  |  r^-  |  <  e,  set  Wj  *  z*  and  restart  the  outer  iteration  (go 
to  [1]). 


[6]  (Initialize  preconditioned  iteration)  Choose  M  as  a 
preconditioning  matrix  for  Kjj,  set  q  -  1,  and  let  p'  -  M-1r' . 

[7]  (Calculate  new  iterate  and  residual)  Set 


For  each  i  e  JflH,  do  the  following:  set  zj+1  -  [z?+1-]+  and  set  the 
projection  flag  if  this  is  a  projection  step.  If  this  is  a  projected 
step,  set  wj-  »  zq+1  and  go  to  [3],  Otherwise,  set 

rq+1  -  rq  -  Sqj  Kjj  pq. 

[8]  If  |  rq+*  |  <  e,  set  w^-  -  zq+*  and  restart  the  outer 
iteration  (go  to  [1]). 

[9]  (Calculate  new  search  direction)  Set 
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Kti  M'1  r’*1  r("+1)’  if1  r^1 


pq  -  M~^r^+1  + 


q  -  q  +  1. 


Go  to  [7]  for  the  next  preconditioned  step. 


O'Leary's  algorithm  does  not  take  projected  steps;  instead,  a  step 
is  truncated  at  the  first  boundary  it  encounters.  If  M  is  set  to  the 
identity  in  O'Leary's  algorithm,  then  the  algorithm  is  identical  to 
Polyak's.  As  long  as  the  preconditioning  matrix,  M,  is  positive 
definite,  O'Leary's  algorithm  converges  after  a  finite  number  of 
Iterations.  The  projected  version  of  this  algorithm  has  performed  well 
on  the  problems  used  for  testing  the  algorithm  developed  in  this  paper; 
however,  a  thorough  investigation  of  its  properties  remains  to  be  done. 


Possible  choices  for  M  include  the  diagonal  of  KJJ*  M  is  clearly 
positive  definite  in  this  case  if  K  is  positive  definite,  so  for 
O'Leary's  algorithm  the  finite  convergence  property  will  hold.  Another 
choice,  which  O'Leary  investigated,  is  to  define  M~^r  as  follows.  Let 


where  z  is  the  vector  obtained  by  applying  one  iteration  of  the 
symmetric  successive  over-relaxation  (SSOR)  algorithm  to  system  (21) 
with  Z*  as  the  current  estimate  of  Wj.  In  O'Leary's  version  of  this 
application  of  the  SSOR  method,  variables  are  truncated  during  the 
forward  and  backward  passes.  The  M  corresponding  to  this  process  is  not 
necessarily  positive  definite,  soothe  properties  of  the  algorithm  are 
unknown.  However,  O'Leary  reported  good  results  with  this 
preconditioning  method.  For  the  projected  algorithm  described  here,  the 
variables  are  not  truncated  during  the  forward  and  backward  passes  of 
the  SSOR  method.  Consequently,  the  preconditioning  matrix  M  defined  by 
this  process  is  positive  definite  (see  Hageman  and  Young  [1981])  and 
should  make  the  investigation  of  the  properties  of  the  projected  CG 
algorithm  easier.  The  following  is  a  description  of  the  forward  and 
backward  passes  of  one  iteration  of  the  SSOR  method. 


1.4  MERIT  FUNCTION  FOR  STEP-LENGTH  CONTROL 


We  use  Han’s  merit  function  for  step-length  control.  Suppose  the 
current  iterate  is  xc  and  the  step  vector  is  sc,  then  let 
<J>(a)  *  pr(xc  +  asc),  where  pr(.)  is  given  by  (4)  and  a,  0  <  a  <  1,  is 
the  step-length  parameter.  We  set  x11  »  xc  +  asc  if 

<(»(a)  <  4>( 0)  +  a  a<J>'  (0),  where  0  <  a  <  1.  The  derivative,  <(>’  (0) ,  is  the 
right-hand  derivative  of  t  at  0  and  is  computed  as  follows. 


0:  g.j(xc)  <  0  or  [gj(xc)  -  0 
and  7gj(xc)*sc  <  0] 
7gj(xc)’sc:  otherwise 


Define 


for  inequality  constraints,  and 


Hk  - 


for  equality  constraints.  Then  $'(0)  is  given  by 

m  nrt-p 

(22)  <0*(O)  -  7f(xc)'s  +  r  {  Z  G.  +  Z  H.  }  , 

j-1  J  k-nH-1 

where  r  is  larger  than  the  absolute  value  of  any  of  the  multipliers. 

The  choice  of  a  determines  the  strictness  of  the  test.  Normally,  a 
is  set  to  0.1.  Note  that  in  general,  $(.)  is  a  continuous,  but  not 
necessarily  smooth,  function  of  a.  It  is  still  the  case,  however,  that 

A  A 

for  some  a  e  (0,  a)  with  0  <  a  <1  the  test  can  be  passed  if  s  is 
really  a  descent  direction  for  pr(.). 

Han's  [1977]  algorithm  differs  from  the  one  developed  here  in  that 
the  new  iterate,  x^+^,  is  given  by 

xk+l  .  xk  +  akgk 

for  any  in  [0,  5]  satisfying 

pr(x1t+1-)  <  min  pr(xk  +  ask)  + 

0  <  a  <  C 


-  Vhk(xc)'sc:  hk(xc)  <  0, 
Vhk(xc)’sc:  hk(xc)  >  0, 

| Vhk(xc) 'sc| :  hk(xc)  -  0, 
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with  {e^}  a  sequence  of  numbers  satisfying 


k-0 

and  C  is  some  positive  number.  Han  shows  that  his  algorithm,  with  a 
proper  choice  of  r  for  the  merit  function  (4),  is  globally  convergent 
under  the  following  conditions: 

(i)  f,  gj,  j  ■  1,  . . . ,  m,  are  continuously  differentiable; 

(ii)  f  is  strictly  convex  and  bounded  below; 

(iii)  the  constraint  functions  are  convex; 

(iv)  the  set  X  »  {x:  g(x)  <  0}  is  compact  and 
X°  -  {x:  g(x)  <  0}  *  t  ; 

(v)  there  exist  positive  numbers  and  \2  such  that  for  each  k 
and  for  any  x  e  Rn, 

x'x  <  x'Qkx  <  X2  x’x. 

This  result  can  be  extended  to  include  equality  constraints.  In  the 
same  paper,  Han  proves  a  weaker  global  convergence  property  requiring 
only  conditions  (i)  and  (v)  and  that  each  quadratic  program  generated  by 
the  algorithm  have  a  Kuhn-Tucker  point  with  a  Lagrange  multiplier  vector 
bounded  by  r  in  «-norm.  Note  that  these  global  convergence  results 
require  only  that  the  Hessian  approximation  matrices,  Qk,  be  positive 
definite,  with  their  eigenvalues  bounded  above  and  below.  The  step- 


length  control  parameter  used  In  the  algorithm  developed  here  uses  the 
same  merit  function  as  Han,  but  we  do  not  require  the  nearly  exact 
minimization  over  a  as  Han  does*  Instead,  we  require  the  step  to 
achieve  a  rate  of  descent  compatible  with  the  local  behavior  of  the 
problem  functions.  This  approach  is  similar  to  the  Goldstein-Armijo 
principle  (see  Fletcher  [1980])  and  has  performed  well  in  testing  (see 


Part  II) 


1.5  SPARSE  UPDATING  PROCEDURE 


Variable  metric  algorithms  for  unconstrained  minimization  update 
approximations  to  the  Hessian  in  such  a  way  that  the  quasi-Newton 
condition  is  satisfied.  Let  f  be  the  objective  function  of  an 
unconstrained  minimization  problem  and  let  Qn  be  the  updated 
approximation  to  the  Hessian.  Then  the  updating  procedure  used  to 
obtain  Q  satisfies  the  quasi-Newton  condition  if 


Qn(xn  -  xc)  -  Vf(xn)  -  Vf(xc) 


Updates  satisfying  the  quasi-Newton  condition,  such  as  the  BFGS  update, 
have  many  desirable  properties,  including  superlinear  convergence  (see 
Fletcher  [1980]).  If  the  Hessian  of  f  is  sparse,  it  is  advantageous  if 
the  updating  procedure  maintains  the  sparsity  pattern.  Shanno  [1980] 
has  shown,  however,  that  it  is  not,  in  general,  possible  to  have  an 
updating  scheme  that  satisfies  the  quasi-Newton  condition,  maintains  a 
given  sparsity  pattern,  and  preserves  positive  definiteness. 


The  same  problem  arises  for  Che  constrained  problem  where  che 
quasi-Newton  condition  is 

(23)  Qn(xn  -  xc)  -  Vx!(xn,  un>  vn)  “  7xA(xc,  u11,  vn) . 

As  discussed  earlier,  the  solution  of  the  quadratic  subproblems  requires 
that  the  updating  scheme  be  positive  definite.  Moreover,  for  solving 
large  problems,  maintaining  the  sparsity  pattern  of  the  Lagrangian 
Hessian  is  essential.  Therefore,  the  requirement  that  the  update 
satisfy  (23)  has  been  dropped  in  favor  of  maintaining  the  sparsity 
pattern.  The  update  is  forced  to  have  the  desired  sparsity  pattern  by 
"zeroing  out"  the  appropriate  elements  in  the  lower  triangular  factor  of 
the  update.  The  effect  of  this  decision  on  the  local  convergence  rate 
is  unknown  even  in  the  unconstrained  case.  Test  results,  however,  have 
been  encouraging.  Thapa  [1983]  reports  favorable  results  for  this  type 
of  procedure  applied  to  the  BFGS  update  for  unconstrained  optimization 
as  long  as  the  updated  factor  remains  positive  definite. 

Since  the  algorithm  developed  here  maintains  a  sparse  lower 
triangular  approximation  to  the  Cholesky  factor  of  the  Hessian,  a 
procedure  for  updating  the  lower  triangular  approximation  is  used.  The 
procedure  is  a  modification  of  Goldfarb's  [1976]  BFGS  procedure  for 
updating  the  Cholesky  factor  of  a  positive  definite  Hessian 
approximation.  The  procedure  is  simple  to  implement.  Other  methods  for 
updating  a  sparse  Hessian  approximation  have  been  developed  by 


Shanno  [1980]  and  Toint  [1977].  These  methods  were  not  used  because  of 
their  complexity  and  because  they  are  not  directly  applicable  to 
updating  a  lover  triangular  factor. 


Goldfarb  Updating  Procedure 

Let  sn  *  xn  -  xc  and  let  Lc  be  the  current  approximation  to  the 
lower  triangular  factor  of  the  Hessian  of  the  Lagrangian,  i.e.,  let 
Qc  *  LCLC' .  The  BFGS  update  for  Qc  is  given  by 


,  Q  snsn,Q 

Q  -  Q  +  - _£  , 

n  C  Sn,y  sn*Qcsn 


where  y  -  V  A(xn,  u11,  v°)  -  V  Jl(xc,  un, 


q  -  y/[(sn,y)(sn,Q  sn)]1//2  -  Qcsn/s 


Define  z  and  w  by  Lcz  *  q  and  w  ■  Lc'p. 


Qn  -  (I  +  qp')LcLc’(I  +  pq’) 


-  L  (1  +  zw')(I  +  wz* )L- ' . 


We  wish  to  find  L  such  that  (I  +  zw' )  ■ 


vn) .  Let  p  ■  sn  and 
n,Qcsn. 

Then 


LQ* ,  with  L  lower  triangular 


and  S2  orthogonal.  Then 

Qn  -  LCL  Q*Q  L'LC'  -  LCL(LCL  )' 


so  that  LQ  -  LfiL  is  the  new  approximation  to  the  Cholesky  factorization 
of  the  Lagrangian.  Ln  will  not  necessarily  have  the  same  sparsity 
structure  as  Lc;  however,  it  is  simple  to  ignore  fill-in  in  Ljj.  If 
w'z  ■  -1  so  that  I  +  zw'  is  singular,  then  we  cannot  update.  However, 
w'z  *  -1  occurs  only  if  s'y  ■  0.  If  s'y  <  0,  a  modification  suggested 
by  Powell  [1978]  is  used  that  maintains  the  positive  definiteness  of  the 
update  (see  Part  II). 


The  L  sought  is  given  by 


Pi*,  i  -  j 

Pj  wi  +  Yj  i  >  J 

0:  otherwise 


for  i,  j  ■  1,  ...,  n.  The  following  two  recurrences  generate  the 
vectors  3,  y,  p. 


Recurrence  1: 


1.  Set  $n  ■  l/wn« 


2.  For  j  ■  n-1,  n-2,  ...,  1  set 


n 


ft 


ft 


WswWRvK 


( 


Pj+l  ■  sj  Pj  +  cj  Pj+l 


flj+1  "  ^j+1  wj+l  +  Yj+1  zj+l 


3*  Set  Pn  “  Pn 


Since  ■  LCL,  the  j-th  column  of  LQ  is  given  by 


(V.j'Jj  VV.k 


,pj  (Lc).j  +  ^  Oj  wk  +  Y j  Zk^Lc^.k»  J  ■  1»  •••»  n* 


Thus,  Lc  can  be  updated  without  explicitly  forming  L.  Only  the  vectors 
p,  S,  y,  w,  z  need  be  stored.  Any  fill-in  is  ignored. 
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II.l  INTRODUCTION 


This  part  describes  the  computer  implementation  of  the  algorithm 
and  gives  the  results  obtained  by  applying  the  algorithm  to  several  test 
problems.  An  assessment  of  the  capabilities  of  the  algorithm  based  on 
the  test  results  is  also  given. 

The  computer  implementation  of  the  algorithm  is  referred  to.as 
OPCON.  The  discussion  here  focuses  on  those  issues  that  are  of 
particular  importance  in  implementing  the  algorithm  on  a  computer. 

The  notation  used  in  this  part  will  be  the  same  as  in  Part  I,  with 
the  exception  that  references  to  equations  in  Part  I  will  be  preceded  by 
"I,"  e.g.,  (1.16)  will  refer  to  equation  16  in  Part  I. 


II. 2  PRELIMINARIES 


Before  discussing  Che  computer  implementation,  it  would  be  helpful 
to  describe  how  the  implementation  handles  function  evaluation,  gradient 
storage,  and  gradient  estimation. 


II. 2.1  FUNCTION  EVALUATION  SUBROUTINE 


The  user  must  supply  a  subroutine  that  computes  the  value  of  the 
objective  function  and  all  of  the  nonlinear  constraint  functions  at  a 
point  passed  to  the  subroutine.  Linear  constraints  are  handled  by 
providing  the  coefficients  as  part  of  the  gradient  data.  Each  call  to 
the  function  evaluation  routine  results  in  the  evaluation  of  the 
objective  function  and  all  of  the  nonlinear  constraints. 

A  smoother  version  of  the  OPCON  algorithm  would  allow  a  separate 
call  to  evaluate  the  objective  function  and  another  call  to  evaluate  the 
nonlinear  constraints.  This  feature  would  be  useful  in  problems  having 
all  variables  represented  in  the  objective  but  having  sparse  constraint 
functions.  The  sparse  finite  differencing  gradient  procedures  of 
Coleman  and  More  [1982] ,  discussed  in  the  following  section,  would 


provide  more  savings  because  the  method  could  be  applied  to  just  the  set 
of  constraint  functions.  A  single  non-sparse  function  in  the  set  of 
functions  handled  by  the  Coleman  and  More  method  will  result  in  no 


savings  in  function  evaluations.  For  example,  if  two  functions  of 
variables  of  length  two  have  gradients  that  are  structurally  zero  in 
opposite  components,  then  one  call  to  a  function  evaluation  routine  for 
both  functions  is  enough  to  obtain  forward  difference  estimates  for  the 
two  nonzero  components.  If  one  of  the  functions  has  all  nonzero 
gradient  elements,  then  two  calls  will  be  required.  The  Coleman  and 
More  procedure  exploits  these  relationships.  If  one  function  in  a  set 
is  not  sparse,  then  the  number  of  calls  to  the  function  evaluation 
subroutine  will  be  nearly  the  number  of  variables. 


II. 2. 2  FINITE  DIFFERENCING  AND  SPARSE  GRADIENTS 


The  nonzero  gradient  elements  for  all  nonlinear  functions  are 
estimated  from  either  forward  or  central  differencing.  Since  all 
functions  are  evaluated  by  each  call  to  the  function  evaluation 
subroutine,  it  is  worthwhile  to  reduce  the  number  of  calls  of  the 
subroutine  required  to  estimate  all  nonzero  gradient  elements.  If  the 
gradients  of  the  nonlinear  functions  were  not  sparse,  then  n  calls  of 
the  function  evaluation  subroutine  would  be  required,  where  n  is  the 
number  of  variables  in  the  problem.  If,  however,  the  gradients  are 
sparse,  it  is  possible  to  significantly  reduce  the  number  of  calls  as 
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Coleman  and  More  [1982]  have  shown.  Their  algorithm  has  been  employed 
in  the  development  of  OPCON. 

A  single  value  for  the  finite  differencing  interval  that  is  used 
for  all  variables  is  input  to  the  algorithm.  However,  each  restart  of 
the  algorithm  (restarts  occur  after  failures  to  take  a  step)  causes  a 
finite  difference  interval  to  be  computed  for  each  variable.  This  is 
accomplished  by  a  call  for  each  variable  to  the  subroutine,  FDCALC, 
developed  by  Gill,  Murray,  Saunders,  and  Wright  [1981]  for  determining 
good  finite  differencing  intervals.  Their  procedure  balances  truncation 
error  against  the  noise  induced  by  machine  evaluation  of  the  function. 
Since  the  step  direction  chosen  at  each  Iteration  is  the  solution  to  a 
system  of  equations  having  an  estimate  of  the  Lagrangian  gradient  as  the 
right-hand-side,  choosing  the  finite  difference  interval  for  each 
variable  to  make  the  finite  difference  estimate  of  the  Lagrangian 
gradient  reasonably  accurate  is  appropriate.  Thus,  the  Lagrangian 
function  using  the  current  estimate  of  the  multiplier  vector  is  the 
function  passed  to  these  subroutines.  This  is  an  appropriate  choice 
since  it  is  the  estimated  gradient  of  the  Lagrangian  which  is  used  to 
determine  the  step  direction. 

As  mentioned  previously,  it  is  important  to  reduce  the  number  of 
calls  to  the  function  evaluation  subroutine  in  obtaining  the  finite 
difference  estimates  for  the  sparse  gradients.  The  sparse  gradient 
structure  of  the  objective  function  and  nonlinear  constraint  functions 


is  processed  to  obtain  groups  of  variables,  such  that  the  variables 
in  each  group  can  be  varied  together  to  compute  some  of  the  finite 
difference  elements  of  the  Jacobian  matrix  corresponding  to  the 
gradients  of  these  functions.  Only  n^  calls  to  the  function  evaluation 
subroutine  are  required  to  compute  all  the  structurally  nonzero  entries 
in  the  forward  difference  estimate  of  the  sparse  Jacobian  matrix.  If  nG 
is  significantly  smaller  than  n,  then  there  is  a  considerable  savings  in 
the  number  of  computations  required  to  compute  the  estimate.  Since 
central  differencing  is  used  whenever  there  is  an  indication  that 
forward  differencing  may  not  be  sufficiently  accurate,  the  savings  can 
be  even  more  pronounced.  (See  Gill,  Murray,  Saunders,  and  Wright 
[1981],  Stewart  [1967],  and  section  II. 3. 3  for  further  discussion  of 
when  to  switch  to  central  differencing.) 

Linear  constraints  are  handled  separately  from  nonlinear 
constraints.  The  coefficients  are  stored  in  a  sparse  format  and  used  to 
compute  function  values  or  gradients  as  needed.  These  arrays  are  passed 
to  OPCON.  Each  linear  or  nonlinear  constraint  is  set  to  be  either  an 
inequality  (<)  or  equality  constraint.  The  user  also  sets  the  right- 
hand-side  (RHS)  value  of  each  constraint.  An  initial  estimate  of  the 
solution  and  upper  and  lower  bounds  on  the  variables  are  also  passed  to 


OPCON 
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II. 3  IMPLEMENTATION  OF  THE  ALGORITHM 


1.3.1  AN  OVERVIEW 


The  following  paragraphs  describe  in  detail  each  of  the  steps 
performed  by  OPCON  in  roughly  the  order  in  which  they  occur.  A  brief 
overview  of  the  algorithm  is  given  first. 


[0]  Initialize.  Initialize  data  structures  and  compute  function 
values  and  finite  difference  estimates  at  the  initial  estimate  of  the 
solution. 


[1]  Start  or  Restart.  Set  the  approximation  to  the  Cholesky 
factor  of  the  pernuted  Hessian  of  the  Lagrangian  to  the  identity.  Set 
the  multiplier  estimates  to  zero.  If  it  is  a  restart,  then  compute  a 
finite  difference  interval  for  each  variable  using  the  procedure  given 
by  Gill,  Murray,  Saunders,  and  Wright  [1981]. 


[2]  Form  the  Dual  QP.  Form  the  dual  to  the  quadratic  program  (QP) 
solved  on  each  iteration  of  the  SQP  method.  In  forming  the  dual  QP, 
consider  only  those  inequality  constraints  that  are  active. 
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[3]  Solve  the  Dual  QP  Using  a  CG  Method.  Solve  for  the 
multipliers  of  the  original  QP  by  solving  the  dual  QP  that  has  only 
nonnegativity  constraints  on  Some  of  the  variables  using  a  projected 
preconditioned  conjugate  gradient  (PPCG)  method.  If  unable  to  converge, 
then  restart. 

[4]  Compute  the  Step  Direction.  Let  be  the  gradient  of  the 
Lagrangian  function  at  the  current  estimate  of  the  solution  for  the 
primal  variables  and  the  multipliers  from  the  QP  obtained  in  step  [3]  . 
Then  the  step  direction,  sc,  is  the  solution  of  LcLc's  ■  -g^,  where  Lc 
is  the  current  estimate  of  the  Cholesky  factor  of  the  Hessian  of  the 
Lagrangian. 

[5]  Compute  the  Step  Length.  Find  an  acceptable  step  otsc.  The 
criterion  for  an  acceptable  step  is  a  suitable  reduction  in  the  merit 
function  of  Han  (see  equation  1.6)  based  on  a  local  linear  model  of  the 
problem.  Set  the  new  estimate  of  the  solution,  x11,  to  xc  +  asc.  If 
unable  to  find  an  acceptable  step,  then  restart. 

[6]  Check  for  Termination.  Compute  new  estimates  of  the  gradients 
of  the  objective  and  nonlinear  constraint  functions.  Check  for 
termination.  Successful  termination  occurs  if  the  norm  of  the 
Lagrangian  gradient  is  small. 


Co  Che  Cholesky  facCor  of  Che  Hessian  of  Che  Lagrangian  using  Che  BPGS 
formula  as  given  by  Goldfarb  [1976]  buC  ignore  any  change  Co  scrucCural 
zeros  in  L.  If  Che  updaCe  is  unaccepCable,  Chen  resCarc.  OCherwise, 
begin  a  new  iceracion  by  going  Co  sCep  [2]. 

The  following  secCions  discuss  Che  acCual  implemencacion  of  Chese 
seeps  in  deCail. 


II .3 .2  INITIALIZATION 

The  inicializacion  secCion  of  Che  code  reads  in  Che  daca  file 
defining  Che  problem  and  esCablishes  Che  sparse  sCorage  sCrucCures  for 
Che  Hessian  facCorizaCion  and  Che  gradienCs.  The  maximum  number  of 
acClve  conscraincs  is  compuCed  based  on  Che  number  of  variables  in  Che 
problem  and  Che  amounC  of  sCorage  allocaCed  Co  Che  array  Char  will  be 
used  Co  score  daCa  defining  Che  dual  quadraCic  program.  This  array  is 
dimensioned  Co  be  very  large,  buC  ic  is  used  for  scoring  ocher  daca 
during  differenc  seeps  of  Che  algorichm. 

The  objeccive  and  conscrainc  funecions  are  evaluaced  ac  che  inicial 
poinc.  The  degree  of  inf easibillcy  is  compuCed  using  Che  following 


formula: 


(1)  FEAS 


Z  [g,(x~)  -  RHS  ]  +  Z  |h,  (x  ) 

j-1  J  k-mfl 


-  RHS, 


+  Z  max  (0.0,  xT"  -  x,T  ,  x^  -  x“)  , 

i-1  I  I  I  l 


where  the  first  m  constraints  are  inequality  constraints  and  the  last 
rati,  . ..,  ratp  constraints  are  equality  constraints.  The  upper  and  lower 
bounds  on  the  i-th  variable  are  given  by  x“  and  x^,  respectively.  For 
j  -  i,  ...,  ratp,  RHSj  specifies  the  right-hand-side  value  for 
constraint  j. 


The  initial  finite  difference  interval  is  set  to  an  input  value  and 
is  used  for  all  variables.  Forward  differencing  is  used  initially. 


II. 3. 3  FIRST  ITERATION  AND  ALL  RESTARTS 


First  Iteration 


On  the  first  iteration  and  on  all  restarts,  the  Cholesky  factor  is 
set  to  the  identity  and  all  multiplier  estimates  are  set  to  zero.  The 
value  of  MAXLAM,  which  is  used  as  an  upper  bound  on  the  largest 
multiplier  in  absolute  value,  is  also  set  to  zero. 


.  •jSWj.HV.nvV.H-.AiiHVA  A  A  A  A 


Restart 


On  each  restart  the  actions  described  in  the  preceding  section  are 
repeated  and  in  addition  the  following  actions  are  taken.  The  finite 
difference  interval  for  each  variable  is  recomputed  using  the  method  of 
Gill,  Murray,  Saunders,  and  Wright  [1981]  that  optimizes  the  interval 
for  each  variable  in  order  to  make  the  finite  difference  estimates  of 
the  gradients  as  accurate  as  possible.  The  method  also  computes  central 
differencing  intervals.  The  function  used  is  the  Lagrangian  function. 
(During  a  restart,  the  multipliers  are  not  reset  to  zero  until  the 
finite  difference  intervals  are  recomputed.)  The  method  of  Gill, 

Murray,  Saunders,  and  Wright  is  available  as  a  subroutine  called  FDCALC. 

Central  differencing  is  used  if  FDCALC  detects  an  error  condition 
or  if  the  following  inequality  is  satisfied  for  any  variable: 

q  <  10m  hi, 

where  m  is  5  for  this  implementation.  This  test  is  described  in  Stewart 
[1967].  is  an  approximation  to  the  i-th  diagonal  element  of  the 

Hessian  of  the  function  computed  by  FDCALC,  q  is  the  optimum  forward 
differencing  interval  computed  for  the  i-th  variable,  and  q  is  an 
estimate  of  the  i-th  component  of  the  gradient  also  computed  by 
FDCALC.  If  a  gradient  estimate  for  a  variable  is  small  relative  to  the 


Hessian  estimate,  then  central  differencing  should  be  used  because  the 
forward  difference  estimate  is  unlikely  to  be  very  accurate. 


II. 3. 4  FORM  THE  DUAL  QP 


Recall  from  Part  I  that  the  step  direction  computed  at  each 
iteration,  sc,  is  the  solution  to  the  following  quadratic  program: 


min  7f(xc)'s  +  (l/2)s'L  L  ' s 


subject  to  7gj(xc)'s  +  gj(xc)  <  0.0,  j-1,  ...,  m. 


7hk(xc)'s  +  hk(xc)  ■  0.0,  k-m+1,  ...»  nri-p. 


xi  <  xi  +  3i  <  i“l»  •  ••,  n. 


where  Lc  is  the  current  approximation  to  the  Cholesky  factor  of  the 
Hessian  Lagrangian.  By  transforming  to  a  dual  problem  a  much  simpler 
quadratic  program  can  be  solved.  In  order  to  make  the  dimension  of  the 
dual  problem  as  small  as  possible,  only  active  or  nearly  active 
constraints  are  considered.  Equality  constraints  are  always  included. 
Any  upper-  or  lower-  bound  constraint  on  a  variable,  x^,  will  be 
included  if  the  constraint  is  violated  or  if  the  current  value  of  the 
variable  is  within  BNDV*(x^  -  xj)  units  of  either  the  upper  or  lower 
bound.  BND7  is  an  input  parameter  and  is  typically  5x10"*^.  An 
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inequality  constraint  is  considered  active  if  the  constraint  is  violated 
or  if  it  is  within  ACTV  units  of  being  violated,  where  ACTV  is  another 
input  parameter  and  is  chosen  to  fit  the  scaling  of  the  problem.  It  is 
important  to  properly  scale  the  constraints  since  this  criterion  is 
applied  to  all  the  inequality  constraints. 


The  dual  quadratic  program  (see  section  1.3.1)  is 


(2)  min  (l/2)u'M'Mu  +  u'q 


subject  to  Uj  >  0.0, 


where  the  i-th  column  of  M  corresponds  to  an  inequality  constraint.  Let 
the  active  constraints  as  defined  in  the  previous  section  be  numbered 
from  1  to  p' ,  including  any  active  upper-  or  lower-  bound  constraints. 
Then  the  j-th  column  of  M  is  the  solution,  y,  to  Lcy  *  Vgj(xc),  where  gj 
can  be  either  an  equality  or  inequality  constraint.  This  is  a  sparse 
triangular  system  of  equations  that  is  easily  solved  using  a  sparse 
forward  substitution  method.  Vector  q  is  given  by  q  *  M't  -  g(xc)  +  RHS 
where  t  is  the  solution  to  Lct  ■  Vf(xc)  and  g  is  the  vector  of 


active  constraints.  Again,  a  sparse  triangular  system  is  solved  to 


obtain  the  vector. 

As  long  as  M  has  full  column  rank,  which  will  be  the  case  as  long 
as  the  gradients  of  the  active  constraints  are  linearly  independent  and 
Lc  is  nonsingular,  M’M,  the  matrix  defining  the  dual  quadratic  program, 
will  be  positive  definite  and  a  solution  to  (2)  will  exist.  If  the 
gradients  of  the  active  constraints  are  not  linearly  independent,  which 
will  be  the  case  if  there  are  more  active  constraints  than  variables  in 
the  nonlinear  program,  then  (2)  cannot  be  solved  with  the  CG  method 
described  in  the  next  section.  The  algorithm  recognizes  when  there  are 
more  constraints  than  variables  and  takes  the  following  action.  A  value 
of  e  is  added  to  each  of  the  diagonal  elements  of  M'M,  where  £  is  a 
small  (10”^)  positive  number  input  to  0PC0N.  This  is  a  perturbation  to 
the  original  problem,  which  allows  the  algorithm  to  continue  making 
progress  when  far  from  a  solution  (see  section  1.3.1).  It  is  expected 
that  the  number  of  active  constraints  near  the  solution  will  be  equal  to 
or  less  than  the  number  of  variables. 

II. 3. 5  SOLVE  THE  DUAL  QP  USING  THE  PPCG  METHOD 

The  basic  method  for  finding  the  solution  to  the  dual  QP  using  a 
projected  preconditioned  conjugate  gradient  (PPCG)  method  is  given  in 
section  1.3.3.  The  computer  implementation  follows  the  steps  given  in 
the  description,  with  a  few  exceptions. 
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The  discussion  here  refers  to  the  following  quadratic  program: 


(3)  min  (l/2)u'Ru  +  k'u 


subject  to  u^  >  0.0  for  i  e  IjNEQ 


where  Ijjjeq  is  the  set  of  indices  corresponding  to  the  inequality 
constraints  considered  active  in  (1.1)  and  K  ■  M'M  or  K  *  M'M  +  el  and 
k  ■  q. 


The  exceptions  include  using  an  error  tolerance  to  determine 
whether  the  system  is  ill-conditioned  or  nearly  singular,  passing 
through  the  algorithm  the  first  time  using  a  weak  optimality  criterion 
followed  by  a  restart  with  the  required  criterion  on  the  norm  of  the 
residual,  and  scaling  the  K  matrix  to  speed  convergence. 


Checking  for  Ill-Conditioning  in  K 


The  value  of  Uj  ’KJJUJ  is  computed  during  each  conjugate  gradient 
iteration  where  the  subscript,  J,  refers  to  restricting  attention  to  a 
subset  of  the  variables  in  (3).  If  Uj’KjjUj/uj'uj  is  small,  then  K  is 
nearly  singular  or  ill-conditioned.  If  this  condition  is  detected,  a 
flag  is  set  and  an  attempt  is  made  to  solve  the  perturbed  problem  to  be 
described  shortly. 


$ 


M  '.T .T  .V  .T I  »•'  '.V*  7C  ,". 


V.». \  .\«.7‘. V. -M.i. V- * 


1/  u  \J,* 


*  V  '  r  ■  .  ■  j  *  •  ■  ;  ^  -  .  ■  1 


Scaling  the  Dual  QP  Matrix 

Before  the  PFCG  algorithm  is  initiated,  the  matrix  K  is  scaled  as 
Kg  CAL  *  SKS,  w^ere  S  is  a  diagonal  matrix  with  ■  C  |K.i  1 
problem  actually  solved  is 

min  (1/2)  v,kscalv  +  v'Sd 
v 

subject  to  >  0.0  for  i-  e  Ijjjeq* 

The  solution,  u*,  is  given  by  u*  ■  Sv*,  where  v*  is  the  solution  to  the 
scaled  problem.  Tests  on  the  residual  norm  are  always  made  relative  to 
the  unsealed  problem  to  ensure  that  the  error  tolerance  is  satisfied  for 
the  desired  problem.  The  initial  estimate  of  the  solution  to  the  scaled 
problem  is  set  to  v°  ■  S-^  u°,  where  u°  is  the  last  estimate  of  the 
multiplier  vector  corresponding  to  the  currently  active  constraints. 

Perturbing  the  Dual  QP 

If  the  algorithm  is  unable  to  converge  after  an  input  number  of 
iterations  or  if  K  appears  to  be  singular  or  ill-conditioned,  a  small 
value,  e,  is  added  to  each  of  the  diagonal  elements  of  KgQ^  and  the 
algorithm  is  restarted  with  the  multipliers  reinitialized  to  the  zero 
vector.  The  small  value  added  is  an  input  parameter  and  is  usually  on 
the  order  of  10“^.  This  procedure  has  an  effect  similar  to  that 
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discussed  in  the  steps  taken  to  form  the  dual  QP  when  there  are  more 
active  constraints  than  variables. 

If  the  CG  algorithm  fails  to  converge,  OPCON  will  restart. 

II. 3. 6  COM>UTE  THE  STEP  DIRECTION 

After  computing  the  new  estimate  of  the  multipliers  the  new  step 
direction,  sc,  is  obtained  as  the  solution  to  the  equation: 

m  mfp  n 

L  L  '  s  -  -{Vf(xc)  +  £  u.«Vg  .(xc)  +  £  v.Vhk(xc)  +  £  ±  w^e,}  . 

C  C  j-l  3  3  k-nri-1  k  ‘  i-1 

Any  multipliers  associated  with  inequality  constraints  considered  to  be 
inactive  are  set  to  zero.  The  multipliers  for  an  active  upper-  or 
lower-bound  constraint  on  the  variable  x^  w^,  will  be  positive  if  the 
upper  bound  is  active  and  negative  if  the  lower  bound  is  active.  This 
sparse  set  of  equations  is  solved  using  the  sparse  code  of  George  and 
Liu  [1981], 


II. 3. 6  COMPUTE  THE  STEP  LENGTH 


OPCON  will  not  accept  a  step  unless  the  step  results  in  a 
sufficient  decrease  in  the  Han  merit  function  (see  section  1.4).  Let 
FEASO  be  the  sum  of  infeasibilities  as  given  by  (1)  at  the  current 
estimate  of  the  solution,  xc.  The  current  value  of  the  merit  function 
is  then 


r-.v. 


PHIO  -  f(xc)  +  MAXLAM  •  FEASO  , 

where  MAXLAM  is  an  upper  bound  on  the  maximum  of  the  absolute  values  of 
the  multipliers.  For  0  <  a  <  1,  evaluate  (1)  at  xc  +  asc  and  define  the 
value  as  FEASN.  The  step,  asc,  will  be  accepted  if 

PHIN  -  f(xc  +  osc)  +  MAXLAM  •  FEASN  <  PHIO  +  era  •  PHISLP. 

The  test  is  made  less  strict  by  setting  0.01  <  a  <  0.5.  The  slope  of 
the  merit  function  is  approximated  by  PHISLP,  which  is  computed  as 
follows : 


-V 

$ 

& 


m  nrt-p  n 

PHISLP  -  f(xc)’sc  +  MAXLAM  {  E  G  (xC)  +  E  R  (xC)  +  £  S  }  , 

j-1  J  k-m+1  i-1  1 


where  for  inequality  constraints  (see  section  1.4) 


>• 

p 
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10:  gj(xc)<RHS j  or  [g  .j(xc)-RHS.j  and  7gj(xc)'sc  <  0.0] 

7gj(xc)'sc:  otherwise 

for  equality  constraints 

-7hk(xc)'sc:  hk(xc)<RHSk 
7hk(xc)'sc:  hk(xc)>RHSk 
|7hk(xc),sc|  :  h^x^-RH^ 

and  for  upper-  and  lower-bound  constraints 

s£:  x![  >  x£  or  [x£  -  x“  and  s£  >  0.0] 

-s£:  x£  <  xj  or  [x£  -  xj  and  s£  <  0.0]. 

The  first  trial  value  of  a  is  1  .  If  this  is  not  an  acceptable 
step,  then  a  is  reduced  by  a  constant  factor  less  than  one.  This 
procedure  is  repeated  until  an  acceptable  step  is  found  or  until  the 

number  of  trial  values  exceeds  an  input  value,  usually  8  to  16.  If  an 

acceptable  step  is  not  found  in  the  allotted  number  of  iterations,  the 
algorithm  will  check  the  smallest  value  of  the  merit  function  obtained 
during  any  of  the  iterations.  If  this  value  is  smaller  than  the  value 
of  the  merit  function  at  the  current  estimate,  the  algorithm  will  take 


the  step  corresponding  to  the  smallest  value,  then  the  algorithm  will 
restart.  No  step  is  taken  that  does  not  improve  the  merit  function. 

11. 3. 7  CHECK  FOR  TERMINATION 

After  a  successful  step  has  been  taken,  an  approximation  to  the 
gradient  of  the  Lagrangian  is  computed  using  the  current  estimates  of 
the  multipliers  and  new  finite  difference  estimates  of  the  gradients  of 
the  objective  and  constraint  functions.  If 

|VxJl(xn,  u11,  vn)|/max{  1,  |f(xn)|> 

is  less  than  the  tolerance  specified  for  stopping,  the  algorithm  will 
terminate. 

The  algorithm  will  also  stop  if  the  step  taken  has  norm  less  than  a 

•  Q 

specified  value,  usually  10  .  The  algorithm  will  terminate,  indicating 

a  failure  if,  the  iteration  after  a  restart  results  in  a  failure  since 
another  restart  would  result  in  the  same  failure. 

11. 3. 8  UPDATE  THE  APPROXIMATION  TO  THE  HESSIAN  OF  THE  LAGRANGIAN 

The  updating  of  the  approximation  to  the  Hessian  of  the  Lagrangian 
function  follows  the  procedure  described  in  section  1.5,  with  the 
exceptions  noted  below. 


II. 4  PERFORMANCE  OF  OPCON 


II. 4.1  TEST  PROBLEMS 


It  is  difficult  to  find  in  the  mathematical  programming  literature 
large-scale  nonlinear  programming  problems  suitable  for  use  as  test 
problems.  Yet  the  performance  of  a  large-scale  algorithm  on  small,  but 
well-known,  test  problems  is  an  important  part  of  evaluating  the 
algorithm.  This  section  describes  the  performance  of  the  OPCON 
algorithm  when  solving  eleven  test  problems,  seven  of  which  appear  in 
the  literature.  The  other  four  were  either  created  by  the  author  or 
obtained  from  unpublished  sources.  Of  these  latter  four  problems,  one 
has  32  variables  and  the  other  three  each  have  60  variables.  They  are 
considered  helpful  in  assessing  the  performance  of  a  large-scale 
nonlinear  programming  algorithm. 

Two  problems  were  obtained  by  adding  a  set  of  ten  nonlinear  and 
five  linear  constraints  to  two  well-known  unconstrained  optimization 
problems  (Buckley  and  Lenir  [1983]).  These  two  problems  are  highly 
nonlinear  and  non-convex.  The  interiors  of  several  degenerate 
ellipsoids  are  excluded  from  the  feasible  region  by  some  of  the 


constraints.  Since  the  unconstrained  minima  are  contained  in  the 
excluded  region,  the  problems  are  strongly  nonconvex.  Since  these 
problems  were  created  for  this  paper,  we  are  unable  to  compare  their 
solutions  with  any  solutions  obtained  with  other  algorithms. 

Also  included  in  the  set  of  test  problems  is  a  weapon-allocation 
problem  having  a  single  linear  constraint  and  non-negativity  constraints 
on  the  variables;  an  economic  model  of  OPEC  oil  prices  that  has  10 
nonlinear  constraints  and  40  linear  constraints.  Each  of  the  eleven 
test  problems  is  described  in  Appendix  A. 

II. 4 .2  TEST  RESULTS 

The  problems  were  run  on  a  VAX  11/780  minicomputer  using  double 
precision  for  all  noninteger  computations.  The  compiler  option  that 
stores  double  precision  numbers  in  a  format  allowing  a  dynamic  range  of 
10~307  tQ  j^q+307  was  seiected.  Table  1  summarizes  the  test  results. 
The  tolerance  for  successful  termination  was  set  to  10-^  for  all 
problems.  The  CPU  time  (in  seconds)  is  the  execution  time  of  the 
program  for  each  problem  and  does  not  include  compilation  or  linking 
time.  The  total  number  of  function  evaluations  and  CG  iterations  are 
given.  The  number  of  each  type  of  constraint  —  nonlinear,  linear,  and 
upper-  or  lower-bound  —  is  given,  followed  by  the  number  considered 
active  at  the  time  the  algorithm  terminated.  The  norm  of  the  Lagrangian 
gradient  and  the  sum  of  infeasibilities  are  also  given.  The  number  of 
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times  the  CG  algorithm  failed  to  converge  and  the  number  of  times  an 
improved  value  of  the  merit  function  could  not  be  found  are  also  given 
for  each  problem.  The  total- number  of  restarts  is  also  shown. 

All  eleven  problems  terminated  successfully  for  the  termination 
tolerance  given  above.  Even  though  the  OPCON  algorithm  is  not 
specifically  designed  to  solve  problems  having  linear  constraints,  it 
did  perform  well  on  the  oil  price  model  and  tolerably  well  on  the 
weapon-allocation  model. 

As  noted  earlier,  one  of  the  problems  that  must  be  handled  by  the 

algorithm  is  the  possibility  of  M'M  being  singular  or  nearly  so  due  to 

having  more  active  constraints  than  variables  or  gradients  of  active 

constraints  that  are  nearly  linearly  dependent.  The  algorithm  handles 

this  problem  quite  well.  The  eigenvalues  of  M'M  were  computed  for  each 

iteration  for  all  of  the  test  problems.  Condition  numbers  ranged  up  to 

g 

10  .  The  ill-conditioning  seldom  caused  the  CG  algorithm  to  fail.  The 
accuracy  of  the  multiplier  vector  obtained  under  such  circumstances  is 
questionable;  however,  the  algorithm  almost  never  failed  to  take  a  step 
after  solving  these  ill-conditioned  problems. 

It  is  interesting  to  note  that  the  average  number  of  CG  iterations 
per  main  Iteration  for  each  problem  is  only  slightly  more  than  the 
number  of  active  constraints  (the  order  of  the  dual  QP  solved  by  each 
call  to  the  CG  algorithm)  indicating  the  efficiency  of  solving  the  dual 


and  starting  with  the  last  estimate  of  the  dual  variables*  The  number 


of  function  evaluations  seems  high  until  one  considers  that  finite 
differencing  was  used  to  compute  all  gradients.  In  some  cases  central 
differencing  was  used,  which  increases  the  number  of  function 
evaluations . 

II. 4. 3  PARAMETER  VALUES 


In  the  process  of  obtaining  these  test  results,  it  was  found  that 
the  values  of  several  of  the  input  parameters  are  particularly  critical 
to  the  successful  termination  of  the  algorithm.  The  choice  of  0.1  for 
the  step-length  parameter,  a,  was  found  to  be  quite  good  for  most 
problems.  Smaller  values  sometimes  allowed  the  algorithm  to  drift, 
while  larger  values  tended  to  cause  more  step-length  iteration  failures 
and  hence  considerable  more  computational  effort  since  each  failure 
causes  a  restart. 

The  convergence  criterion  for  the  norm  of  the  residual  in  the  PPCG 

—1  2 

algorithm  was  normally  set  to  10  .  Larger  values  produced  estimates 

of  the  multiplier  vector  that  were  sometimes  not  accurate  enough  to 
obtain  a  good  step  direction,  whereas  smaller  values  caused  more  CG 
Iteration  failures  with  no  compensating  improvement  in  performance.  The 
relaxation  parameter  for  the  SSOR  preconditioning  step  was  set  to  1.3. 
This  value  gave  good  performance  while  larger  values  typically  took  more 
iterations  of  the  CG  algorithm  to  converge. 
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Best  choices  for  the  parameters,  BNDV  and  ACTV,  used  to  determine 


whether  an  upper-  or  lower-bound  constraint  or  a  general  inequality 
constraint  should  be  considered  active  are  very  much  problem 
dependent.  A  choice  of  1.0  for  ACTV  and  0.0005  for  BNDV  was  normally 
acceptable,  but  for  some  problems  different  values  were  used.  For 
example,  ACTV  was  set  to  0.05  for  the  Proctor-Gamble  problem  and  to  0.5 
for  the  Hexagon  problem.  The  BNDV  parameter  was  set  to  0.005  for  the 
weapon-allocation  problem,  to  allow  it  to  pick  up  zero  allocations  more 
quickly.  If  BNDV  or  ACTV  is  set  to  too  large  a  value,  constraints  not 
active  at  the  solution  may  continue  to  be  considered  active  when  the 
algorithm  gets  close  to  a  solution.  In  this  case,  it  is  possible  that 
more  constraints  will  be  active  than  there  are  variables  in  the  problem, 
and  the  need  to  solve  the  perturbed  problem  near  the  solution  may  hinder 
convergence. 

II. 4. 4  FULL  HESSIANS  VERSUS  SPARSE  HESSIANS 

One  issue  of  major  concern  during  the  development  of  this  algorithm 
was  the  degree  to  which  ignoring  fill-in  in  the  BFGS  update  would 
degrade  the  performance  of  the  algorithm.  Table  2  shows  the  results  of 
comparing  full  and  sparse  Hessian  matrices  on  a  subset  of  the  test 
problems.  Fill-in  was  not  ignored  for  the  full  Hessian  runs.  The  table 
shows  the  degree  of  sparseness  tot  each  problem.  The  termination 
criterion  was  set  to  10“^,  to  force  a  more  stringent  comparison. 
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The  value  of  this  comparison  Is  limited,  considering  the  size  of 
the  test  problems  used.  The  results  suggest,  however,  that  the 
practical  penalty  for  ignoring  fill-in  is  minimal. 

II. 4. 5  COMPARING  OPCON  TO  MINOS  5.0 

MINOS  5.0  is  a  well-known  implementation  of  the  projected 
Lagrangian  algorithm  developed  by  Murtagh  and  Saunders  [1982] .  The 
interested  reader  should  consult  the  user's  guide  for  MINOS  5.0  (Murtagh 
and  Saunders  [1983])  to  obtain  a  full  description  of  the  features  of  the 
code.  Performance  of  OPCON  and  MINOS  5.0  on  several  of  the  test 
problems  is  summarized  in  table  3.  (Timing  data  for  MINOS  was  obtained 
from  the  same  VAX  11/780  system  described  earlier.) 

MINOS  is  clearly  superior  to  the  current  version  of  OPCON  for 
problems  having  nearly  linear  constraints,  such  as  the  weapon-allocation 
problem  and  the  World  Bank  problem.  For  problems  that  are  highly 
nonlinear,  especially  in  the  constraints,  OPCON  is  as  good  as  MINOS  and 
often  much  better.  MINOS,  for  example,  was  unable  to  achieve  any 
significant  progress  on  the  modified  Powell  singular  function  problem, 
whereas  OPCON  manages  to  find  a  feasible  solution  having  a  much  improved 
objective  function  value.  The  Colville  no.  2  problem  has  considerable 
nonlinearity  in  the  objective  function  and  constraints,  but  ten  of  its 
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TABLE  3.  OPCON  VS.  MINOS  COMPARISON 


OPCON 


MINOS 


Problem  name 

Function 

evaluations 

CPU 

time  (sec) 

Function 

evaluations 

CPU 

time  (sec) 

US  Steel  No.  1 

21 

1 

59 

3 

US  Steel  No.  4 

236 

4 

181^^ 

5 

Hexagon 

114 

5 

287 

7 

Wong  No.  2 

179 

3 

1,836 

23 

Dembo  No.  1 

570 

16 

3,945 

1,711 

Colville  No.  2 

1,130 

52 

678 

15 

Weapon  Allocation 

3,993 

286 

1,398 

24 

World  Bank 

248 

83 

1,214 

27 

Mod.  Powell  Sing. 

10,565 

596 

v-/ 

^  '  MINOS  found  a  worse  solution  than  the  one  found  by  OPCON. 
^  '  MINOS  was  unable  to  solve  this  problem. 


II. 5  CONCLUSIONS 


Based  on  the  results  given  In  the  preceding  section,  the  OPCON 
algorithm  shows  promise  of  being  a  practical  tool  for  solving  large- 
scale  nonlinear  programs.  Obviously,  problems  with  60  variables  are  not 
large;  however,  the  scarcity  of  problems  in  the  literature  having  even 
half  as  many  variables  would  indicate  that  these  results  are 
significant.  Also,  some  of  these  test  problems  —  i.e.,  the  U.S.  Steel, 
Colville,  and  Dembo  problems  —  are  very  difficult  to  solve  in  spite  of 
their  smallness.  It  is  hoped  that  the  algorithm  will  soon  be  applied  to 
larger  problems  that  would  allow  a  more  realistic  evaluation  of  its 
ability  to  solve  large,  sparse  nonlinear  programs. 

Because  it  uses  an  active  set  strategy  and  solves  a  dual  problem, 
the  algorithm  should  be  able  to  deal  with  large  numbers  of  nonlinear  or 
linear  inequality  constraints  since  the  size  of  the  dual  problem  is 
determined  by  the  number  of  active  constraints.  The  current  version  of 
the  OPCON  algorithm  stores  the  M  matrix  —  which  has  n  rows  and  p' 
columns,  where  n  is  the  number  of  variables  in  the  problem  and  p'  is  the 
number  of  active  contraints  —  in  a  dense  format.  This  array  is  the 
limiting  factor  on  size.  It  may  be  possible  to  store  this  array  in  a 


sparse  format  (see  section  II. 3 .4). 

The  coded  version  of  the  algorithm  described  here  Is  not  as 
efficient  as  it  could  be.  The  calculation  of  the  eigenvalues  of  the  M'M 
matrix  is,  for  instance,  unnecessary.  Also,  the  function  evaluation 
routine  evaluates  the  objective  and  all  nonlinear  constraint  functions 
on  each  call.  As  discussed  in  section  II. 2,  a  more  efficient  version  of 
the  code  would  split  the  objective  function  evaluation  off  from  the 
nonlinear  constraint  function  evaluations.  This  would  allow  a  more 
efficient  use  of  the  technique  of  Coleman  and  More  [1982]  for  reducing 
the  number  of  function  evaluations  required  to  obtain  finite  difference 
estimates  of  sparse  gradients. 

Several  questions  remain  unanswered.  How  well  the  algorithm  will 
work  for  solving  really  large,  practical  problems  is  probably  the  most 
interesting  of  these.  Other  questions  of  interest  include  determining 
the  effects  of  the  errors  in  the  estimates  of  the  gradients  and  the 
deviation  from  the  BFGS  update  when  fill-in  in  the  Cholesky  factor  is 
ignored  on  the  performance  of  the  algorithm  on  large-scale  problems.  It 
would  be  comforting  to  know  what  conditions  are  required  to  guarantee 
the  convergence  of  the  projected  CG  algorithm  described  earlier. 

The  step-length  control  procedure  is  simple,  and  it  may  be  possible 
to  improve  the  performance  of  the  algorithm  by  improving  this 


procedure.  For  instance,  an  adaptive  procedure  that  would  allow  step- 
length  parameter,  a,  to  be  greater  than  1  could  produce  better  results 
on  problems  having  singular  or  nearly  singular  Hessians. 

The  development  and  testing  of  an  extension  of  the  sequential 
quadratic  programming  algorithm  for  solving  large,  sparse  nonlinear 
programs  has  been  presented  here  .  The  test  results  indicate  that  the 
algorithm  has  the  potential  to  be  a  practical  tool  for  solving  problems 
having  highly  nonlinear  objective  and  constraint  functions. 
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The  test  problems  used  to  evaluate  the  nonlinear  optimization  code 
developed  for  this  paper  are  described  In  this  appendix.  Mathematical 
formulations  are  given  for  the  problems  having  a  small  number  of 
variables.  A  FORTRAN  listing  for  the  coded  problem  Is  Included  for  each 
problem.  Termination  data  reported  for  each  problem  Includes  the 
objective  function  value,  f(x*),  the  norm  of  the  gradient  of  the 
Lagrangian,  e(x*),  and  the  sum  of  the  infeasibilities,  r(x*).  The 
starting  points  for  each  problem,  the  value  of  the  objective  function  at 
the  starting  point,  and  the  sum  of  the  infeasibilities  at  the  start  are 
also  reported.  The  best  reported  result  by  any  other  algorithm  is  also 
included . 


[1]  Betts  Problem 


Reference:  Hock  and  Schittkowski  [1981] ,  problem  no.  53 

min  f(x)  ■  (x^  -  X2^  +  (X2  +  x^  -  2)^  +  (x4  -  1)^  +  (xj 


subject  to 

X1  +  3x2  “  °» 
x3  +  x4  -  2x5  -  0, 

x2  ”  x5  “  0* 

Starting 

point:  x°  ■ 

(7,  2,  6,  1,  2)', 

f(x°) 

-  62.0, 

r(x°) 

■  16.0. 

Results: 

Reported 

0PC0N 

results 

f(x*) 

4.093023256 

4.093023269 

e(x*) 

2.8E-10 

2.17E-04 

r(x*) 

0 

4.83E-15 

O  II 


SUBROUTINE  FNCVAl<X,F> 

SETT’S  PROBLEM. 

DOUBLE  PRECISION  X Cl) , F< 1 ) , ONE , TWO 
DATA  ONE, TVO/ 1.000,2.000/ 

? : I > • <  X  < n-X(2> )**2+<X(2)+X< 3)  -TVO) **2* ( X( 0  > -ONE) **2 
1  ♦ (1(5) -ONE) »*2 

C 

RETURN 
END 


[2]  Proctor  and  Gamble  Problem 


Reference:  Himmelblau  [1972]  problem  no.  11 


min  f (x)  -  5.3578547  x-j2  +  0.8356891  x^  +  37.293239  xL 


subject  to 


0  <  85.334407  +  0.0056858  x2x5  +  0.0006262  xtx4 
-0.0022053  x3  x5  <  92 

90  <  80.51249  +  0.0071317  x^  +  0.0029955  x^ 
+  0.0021813  x32  <  110 

20  <  9.300961  +  0.0047026  x3x5  +0.0012547  XjX3 
+  0.0019085  X3X4  <  25 


78  <  xL  <  102 
33  <  x2  <  45 
27  <  x3  <  45 
27  <  x4  <  45 
27  <  x3  <  45 

Starting  point:  x°  -  (78.62,  33.44,  31.07,  44.18,  35.22)', 
f(x°)  -  10418.2 
r(x°)  -  0.0. 


A- 5 


Results 


Reported 

OPCON 

results 

f(x*) 

10126.60285 

10126.64100 

e(x*) 

4.05E-4 

Not  reported 

r(x*) 

1.60E-11 

Not  reported 

3U3R0UTIN2  FNCVAL(Z.F) 

PROCTOR-GAMBLE  CO.  -  HIMMELBLAU  PROBLEM  NO.  U. 

DOUBLE  PRECISION  It  1),F(  t)  ,AU)  ,1(  4)  ,C(4)  ,D(  4) 

DATA  A/ 3. 3378347,0. 835489,37.293239/ 

DATA  B/ 83. 334 407, 3. 48380-3, 8. 2420-4,-2. 20 330-3/ 

DATA  C/80.31249,7. 13170-3,2. 99330-3,2. 18130-3/ 

DATA  0/9 .30094,4.70240-3,1.25470-3,1. 90830-3/ 

rtl)«A<l)*Z(3)**2  +A(2)*Z< 1)*Z(3)  ♦A<3)*Z(1) 

F(2)-B<1)+B<2)*Z<2)*Z(5)+B(3>*Z(1)*Z(4)+8<4>*Z<3)*Z(5> 

F(3)*-F(2) 

Pt4)»C<t)+C<2)*Z(2),Z<3)+C(3)*Z(l)*Z<2)+C<4)*Z<3),t2 
F ( 5 ) »-F ( 4 ) 

?(4)«0<1)>D(2)*Z(3)*Z(3)+D<3)*Z< 1 ) *Z ( 3 > +0 < 4 ) *Z < 3 ) *Z ( 4 ) 
F<7)—F<4) 

RETURN 

END 


/ 
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[3]  U.S.  Steel  Problems 


References:  Himmelblau  [1972] t  problem  no.  22;  Hock  and  Schittkowski 


[1981],  problem  nos.  95-98. 


min  f(x)  -  Z  c,  x. 
i-1 


subject  to 


S  allXi  +  blXlx3  +  b2x3x5  +  b3x4x5  +  b3x4x5  +  b4x4x6 
1-1 


+  b5x5x6  < 


Z  a12*i  +  b6xlX3  +  b?x4x5  +  b8x4x6  +  b9x5x6  <  B2 


Z  al3xi  +  b1Qx4x5  <  B3 


Z  al4xt  +  buxlX6  <  B4 


The  [a..]  and  [b.]  coefficients  are  given  in  the  listing.  Four  problems 


are  defined  by  the  [B.]: 


Problem 


B1 

b2 

b3 

B4 

-4.97 

+1.88 

+29 .08 

+78.02 

-4.97 

+1.88 

+69.08 

+118.02 

-32.97 

-25.12 

+29.08 

+78.02 

-32.97 

-25.12 

+124.08 

+173.02 

<  x^  < 

0.31 

0  <  x4 

< 

0.042 

<  *2  * 

0.046 

0  <  Xg 

< 

0.028 

<  Xj  < 

0.068 

0  <  x6 

< 

0.0134 

All  four  problems  were  solved  by  scaling  Che  variables  as  follows: 
Scale  x^  by  multiplying  by  10;  scale  the  other  five  variables  by 
multiplying  by  100. 

Starting  points: 

For  all  problems  x°  »  0.0. 

For  problem  2  f(x°)  -  0.0  and  r(x°)  -  4.97. 

For  problem  4  f(x°)  -  0.0  and  r(x°)  -  58.09. 

Results: 

Problems  1  and  2  have  the  same  answer  as  do  problems  3  and  4; 
therefore,  only  two  sets  of  results  are  given. 


Reported 


OPCON 

results 

Problems  1-2 

f(x*) 

0.01561952525 

0.015619514 

e(x*) 

5.80E-05 

0 

r(x*) 

6.17E-16 

2. IE- 09 

Problems  3-4 

f(x*) 

3.135809123 

3.1358091 

e(x*) 

3.31E-09 

0 

r(x*) 

1.89E-14 

0 
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SUBROUTINE  FNCVALCX.F) 


'J3  37EEL  PROBLEM  -  HIMMELBLAU  NO.  22. 

SCALED  VERSION. 

DOUBLE  PRECISION  1(11,1(1) 

INTEGER  I ,NVAR 

DOUBLE  PRECISION  A< 4 , 4 ) ,8(1 1 ) ,C C 4 > .DDOT. SCALE ( 4 ) 

DATA  NVAR/ 4/ 

DATA  SCALE/ 1.001,3*1.002/ 

DATA  C/4 . 3D0 ,3.1001,4.3 3D 1,1 .3801 ,4.1501,4. 700/ 

DATA  A/-17. 100,-38.200,-2. 0402,-2. 1 2302, >4. 23402,-1 . 473503, 

♦  -17.700,-34.800,-1.13702,-1.47702,-3.37802,-1.383203, 

+  -  0.00,2.7302,0.00,7.001,8.1702,0.00,-1.37702,3.1102, 

♦  0.00,-3.8702,-3.7102,-2.17803/ 

DATA  8/ 1 . 47202 , 3 . 3803 ,3.8103,1.8304,2. 4304 ,1.3702,2.4503, 

♦  1.4404,1.7204,-2.404,1.404/ 

DO  10  I«1,NVAR 

XII) >X (I) /SCALE! I ) 

10  CONTINUE 

i 

f< 1)«ODOT(NVAR,C, l,X,l) 

F(2)«DD0T<NVAR,A<1 ,1),1 ,X,  l  >  *B C  l  >  *X  <  1  >  *X < 3  > +B < 2  >  *X  <  3  >  *X  <  3 >  ♦ 

1  B<3)*X<4)*X(5)+8<4)*X(4)*X(4)+B(5)*X<5)*X<4) 

F<3)-DD0T<NVAH,A<1,2) , 1 ,X , l > ♦! ( 4 ) *X ( l ) *X < 3) *B < 7 ) *X ( 4 ) *X ( 3 ) ♦ 

1  B(8)*X(4)*X(4)+8<7)*X<5)*X(4) 

F  <  4 ) aDDOTCNVAR , AC  1 ,3) ,1 «X,l)+i(10)*XC4)*X(5) 

PCS ) aDDOTCNVAR , AC 1,4),1,X,1)+BC11)*XC1)*XC4) 

DO  20  I*1,NVAR 

X<  I ) aSCALEC I ) *X( I > 

20  CONTINUE 

RETURN 

END 


A-ll 


References:  Himmelblau  [1972],  problem  no.  16;  Hock  and  Schlttkowskl 
[1981],  problem  no.  108. 


min  f(x)  ■  ”xix4  +  x2x3  “  X3X4  +  X5X4  “  X5X8  +  X6X7 

subject  to 

x 2 2  +  x42  <  1 

x52  +  x62  <  1 

X^2  +  (X2  “  Xg)2  <1 

(xx  -  x5)2  +  (x2  -  x6)2  <  1 

(XL  -  x7)2  +  (x2  -  Xg)2  <  1 

(x3  -  Xg)2  +  (x4  ~  X6)2  <  1 

(x3  -  Xy ) 2  +  (xA  -  Xg)2  <  1 

x7 2  +  (Xg  -  Xg)2  <  1 

x2x3  -  X^X^  <  0 

-XgXg  <  0 

XgXg  <  0 

X6X7  ■  X5X8  <  0 

0  <  Xg  <  1. 

Starting  point:  x0^  -  1.0,  i  -  1,  ...,  8,  Xg°  *  .9, 
f(x°)  -  0.0, 
r(x°)  - 


«•_  V.  V-V  -Jf-  if-  A 


2 


Results 


f(x*) 

e(x*) 


Repotted 

OPCON  result 


-1.732050808  -1.732050808 

2.63E-08  3.9E-10 


r(x<0 


2.37E-13 


3.3E-12 


SUBROUTINE  FNCVALU.F) 


HIMNEL3LAU  PROBLEM  NO.  U  -  HiZACON. 
DOUBLE  PRECISION  X<1>.F(1> 

F( l)»-F< 1 ) 

FC2)aX(3)**2+X(4)**2 
F(3)bX(3>**2+X(*>**2 
?(4>aXU)**2*(Z<2)-X<9)  >**2 
F(5>«<X(l)-X<5))**2^(X(2)-X<4) >»*2 
FU)«<X<  1)-XC7)  >**2*<X<2>-X<3>  >**2 
F(7)a(X<3)-X<3) )**2*<X<4)-X<<) )**2 
FC#)«<X< 3>-I<  7)  )**2+<XM)-X<l>  >»*2 
F(»)«X(7)**2+<X<3)-X<»> )**2 
FU0)-X(2)*X(3)-X(1)*X(4) 

rU2).X(5)«X(» 

FU3)«X(4)*X(7)-X(3)*X(8) 

RETURN 

END 
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[5]  Wong  Problem  No.  2 

Reference:  Hock  and  Schittkowski  [1981],  problem  no.  113. 

min  f(x)  -  xt2  +  x22  +  x^x2  -  14x^  -  16x2  +  (x3  -10)2  +  4(x4 
+  (x5  -  3)2  +  2(x6  -  l)2  +  5x72  +  7(x8  -  ll)2 
+  2(xg  -  10)2  +  (x1Q  -  7)2  +  45.0 

subject  to 


3(xx  -  2)2  +  4(x2  -  3)2  +  2x32  -  7x4  <  120 
5x^2  +  8x2  +  (x3  -  6)2  -  2x4  <  40 
0.5(x1  -  8)2  +  2(x2  -  4)2  +  3x52  -  xg  <  30 
x^2  +  2(x2  -  2)2  +  2x^x2  +  14xj  -  6xg  <  0 
-3x^  +6x2  +  12(xg  -  8)2  -  7x10  <  0 
4x^  +  5x2  -  3x7  +  9xg  <  105 
10x^  -  8x2  -  17x7  +  2xg  <  0 
~8x^  +  2x2  +  5x^  -  2xj^q  <  12. 


Starting  point:  x°^  ■  0.0,  1*1,  ...,  10 
f(x°)  -  1352.0 
r(x°)  -  810.0. 


•>..>>  Vj>  .>A'> > 


Results 


OPCON 

f(x*)  24.30620907 

e(x*)  1.40E-05 


Reported 

result 

24.3062091 

1.2E-09 


rfx*} 


7.70E-10 


4.6E-10 


3UBR0U7XNE  FNCVAL<X,F) 

C 

DOUBLE  PRECISION  X<1).F<1> 

C 

C  PROBLEM :  VONC  NO.  2;  FROM  K  AND  3  #113  P.  122. 

f 

DOUBLE  PRECISION  TEMP 
C 

TEMP»X< 1)**2+X<2)**2+X(1)*X<2)-14.QD0*X(1)~1.4D1*X(2)+ 
t  <X<3)-1.001)**2+4.  OOO*  (X(  4)-3.0D0)t*2*<X(5)-3.ODO}>*2+ 

*  2.0DO*<X(4)-1.QDO)**2+S.ODO*X(7)**2*7.0DO*<X(I)-1 .  1D1  >**2* 

«  2.0DO*(X(?)-1.0D1)**2+<X(10)-7.0DO)**2+4.5D1 

F  < 1) -TEMP 

TEMP-3 -0D0  *( X( l)-2.0DO)**2+4.0DO*(X(2)-3.0D0)**2+2 . 0D0*X<3>**2 

*  7 . 0D0 *X<  4) -1 . 2Q2 
PC  2) -TEMP 

TEMP -3. 0DO*X<l)**2+S.0DO*X(2)+<X(3)-4.0D0)**2-2.0D0*X(4)-4.0Dl 
P<  3) -TEMP 

TEMP- (3. OD-l ) *<X< 1 )-#. OOO) **2*2 . OOO* <X<2>-4 . 0D0>**2+ 

*  3.0DU*X(3)**2-X(4)-3.0Dl 
F(  4) -TEMP 

TEMP.X<1)**2*2. 0*<X<2)-2. ODO) **2-2 .  ODO*X<  l  >  *X<  2>  «-l . 4D1*X<3)- 
t  4 . 0D0*X( 4) 

P(3)«TEMP 

TEMP— 3. ODO*X(1)*4.0DO*X(2>*1 .  2D1* ( X (  7) .  OOO) **2-7. 0D0*X<10> 

P ( 4 ) -TEMP 

RETURN 
END 
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[6]  Dembos  Problem  no.  IB 


4.0,  i  -  1, 


,  12, 


V.  ' 


Starting  point:  x^  ■ 

f(x°)  -  0.22768, 
r(x°)  -  15.114. 

This  problem  was  solved  by  scaling  the  first  11  variables  by 
multiplying  each  by  10. 

Results: 


f(x*) 

e(x*) 
r(x*) 

This  is  a  difficult  problem  that  0PC0N  has  not  handled  very  well. 
It  is  a  geometric  programming  problem  that  is  poorly  scaled.  0PC0N 
required  300  iterations  to  achieve  the  results  shown,  and  the  optimal 
values  for  the  variables  are  not  as  close  as  one  would  like  to  those 
reported  by  Dembo  [1976] . 


Reported 

0PC0N  result 

3.169024101  3.1682133 

3.33E-04 
0.0 


Cl  C  l  o 


SUBROUTINE  FNCVALIX.F) 


DEMBO  PROBLEM  NO.  1  FROM  SAND CHENS  THESIS. 

DOUBLE  PRECISION  2(1), FU> 

C 

DOUBLE  PRECISION  TEMP , TEMPI , EPS, ONE, TEN 
DOUBLE  PRECISION  B ,A< 1 1 > . C< 30) 

DATA  EPS, ONE, TEN/ 1.00-13,1.000,1.001/ 

DATA  B, A/ t. 005,1. 331720-3, 2. 2707270-3,2. 483440-3, 4.47D0, 

+  4. 47177300,8. HD-3,1. 0720-3,5.0-3, 9. 070-4,8. 80-4,1. 170-3/ 

DATA  C/S.  3473730-2,2.  18437440-2 , 9 . 77335330-2 ,4 . 474080  3D- 3 , 

+  1 .0-4, 1 .0-3 , 1 .0-4 ,1.0-10,1.0-8,1.0-2,1.0-4,1.08784450-1, 

♦  1. 4 1 030520-4,1.0-23, 1. 730454 10-4,1.0-3,1.0-4,1 .0-5, 1.0-4, 

♦  1.0-7, 1.0-7, l.D-3, l.D-3, 1.08 734 450-1,1. 41 080320-5,1.0-23, 

♦  1.7304510-3,1.0-5,1.11840570-4,1.0-4/ 

C 

do  :  i*i,n 

X< I) -X< I) /TEN 
5  CONTINUE 
C 

TEMP-B 

00  10  1-1,11 

IF  (Z(I) .CT.EPS)  THEN 
TENFZ-Z(I) 

ELSE 

TEMPI -EPS 
END  IF 

TEMP -TEMP *TEME Z  *  4 < -A  < I ) > 

10  CONTINUE 
FID-TEMP 
C 

F(2 )-T£N*(C< 1 )*Z(1)+C(2)*Z<2)+C(3)*Z<3)+C(4)*Z(4)*Z( 3) -ONE) 

C 

F(3)-TEN*<C<3)*X< 1>*C< 4) *X< 2>*C< ?> *X( 3>*C ( 8) *X< 4) *X< 12 >♦ 

♦  C(7)*Z(S)  /!( 1 2) +C( 10)  *Z<4)  / Z ( 1 2 ) +C (  11  )*X(7)*X<12)*- 

+  C< 12>*X<4>*X<3)*C<13)*X<2>*X<S>/X<12>* 

♦  C(14)*Z<2)*X<4)*Z(3)+C(13)*Z(Z)*X<5)/(Z(4)*Z(12)>+ 

♦  C(14>*Z(10>/I( 12 )-ONE) 

C 

F ( 4 ) -TEN*  <  C  < 17)*Z(1)+C(18)*Z(2)+C(17)*X<3)*C<20)*Z<4)+C(21)*Z(3)  + 
*■  C(22)*X(4)+C(23)*X(8)+C(24)*Z<4)*X(S)+C(25)*X<2)*X<5)'f 

♦  C(24)*X(2)*X<4>*X(S)+C(27)*X(2)*X(3)/X(4)*C(28>*X(7)* 

♦  C<27)*X<i)*Z(7)*C(30)*Z(l 1 > -ONE) 

C 

00  20  1-1,11 

Z<I)-TEN*Xm 
20  CONTINUE 
C 

RETURN 
END 

/ 


W 


[7]  Colville's  Problem  No.  2 


References:  Sandgren  [1977],  problem  no.  14;  Hock  and  Schlctkowskl 
[1981],  problem  no.  117;  Himmelblau  [1972],  problem  no.  18. 


10  15  15  15 

min  f(x)  *  E  b^x^  +  S  E  Ci-10>  1-10  xix1  +  2  ^  d1-10  X1 

i-1  j-11  i-11  2  L  1  2  j-11  2  iU  2 


subject  to 


15 


10 


-2  ^  Chqj  ^  *ijXi  -  3djX10+j2  <  O.Oj,  j  -  1,  ...,  5, 


*4  ^  0,  j  —  1,  ...,  15. 


See  the  listing  for  coefficients. 


Starting  point:  x°j  -  0.001,  j  »  1,  ...,  15,  j  +  7; 

x°7  -  60.0, 
f(x°)  -  2400.1053, 


jWi 


Results 


OPCON 

f(x*)  32.34867897 

e(x*)  2.21E-05 


Reported 

result 

32.348679 

3.5E-05 


r(x*) 


6.20E-14 


0.0 


SUBROUTINE  FNCVAL ( X ,  F ) 


CQLVILLE3  PROBLEM  NO.  2 

DOUBLE  PRECISION  X(1).F(I) 

INTECER  I,J 

DOUBLE  PRECISION  ZERO 

DOUBLE  PRECISION  B<  10 ) , C< 3 , 5)  ,D< 5)  ,A(10 , 5) ,B1 ,C1 , C2 ,01 
DOUBLE  PRECISION  RHS(S) 

DATA  8/4.01,2.00,2.30-1,4.00,4.00,1.00,4.01,4.01 ,-3.00,-1.00/ 
DATA  0/4. DO, 0.00,1.01,4. DO, 2. DO/ 

DATA  C/3. Dl, -2. Dl, -1.01,3. 20 1,-1. 01,-2.01, 3. PD 1, -A. DO, -3. 1D1, 

♦  3. 2D 1,-1.01,-4.00,1.01, -A. DO, -1.01,3. 2D 1,-3.101, -A. DO, 

♦  3. POt, -2.01 , -1.01,3. 201, -1.01,-2.01 ,3.01/ 

DATA  A/-1.4D1  ,0.00, -3. 500,0.00 . 0.00,2.00, -1. DO, -1 .DO, 1 .DO, 1 .DO, 

♦  2. 00,-2. 00, 0.00, -2.  DO. ->.00,0.00,-1 .00  ,-2.00,2. 00,1..  DO, 

4-  0 . 00 , 0  .  DO  ,  2 . 00 , 0  .  DO  ,  -2  .  DO  ,  -4 . 00 ,  - 1 .  DO  ,  -3  .  DO  ,  3  .  DO  ,  1 .  DO  , 

♦  1.00,4.00,0.00,-4.00,1.00,0.00,-1.00,-2.00,4.00,1.00, 

♦  0. 00, 2..  DO,  0.  DO, -1.  DO, -2. 000,0.  DO, -1.  DO, -l  .00, 3.  DO,  l.  DO/ 
DATA  RH3/ IS. 0,27.0,34. 0, It. 0, 12. 0/ 

DATA  ZERO/ 0.00/ 

BlaZERO 
DO  10  U1.10 

SlaBl I) *X( I) 

10  CONTINUE 

C1-ZEH0 
DO  30  Jail. 13 
00  20  I«ll,15 

Cl»CUC(I-10,J-t0)*X<I  )*X(J> 

20  CONTINUE 
30  CONTINUE 

DUZERO 
00  40  Jml 1,15 

01«D1+D< J-10)*Z<J)**3 
40  CONTINUE 

?m«BUCl*2. 00*01 

DO  100  J. 1,5 
ClaZERO 
DO  50  I  a  1 1 ,15 

ClaCWC(I-10,J)*X(I> 

30  CONTINUE 

C2-ZER0 
DO  AO  Ial.10 

C2aC2*A<I,J)*X(I) 

AO  CONTINUE 

F< J  +  l )  sRHS  ( J)  -  2  .  DO  *C1  ♦  C2  -  3.D0*D(  J,y*X<  J*10)**2 


A-23 


References:  None 


Data: 


Targets 


min  - 


6  7  ,  . 

E  n.  n  (1  -  p  .)Xlj/n:1 
j-1  2  i-1  2 


7  6 

subject  to  £  c  2  x  <  4900 

i-1  j-1  2 


xlj  *  •••* 


Weapons 

1/  _L_ 

1  .50 

.30 
.10 
.05 
.68 
.43 
12 


pij 


2 

3 

4 

.58 

.42 

CM 

• 

.31 

.37 

.36 

.12 

.20 

.30 

.05 

.07 

.07 

.68 

.68 

.61 

.43 

.35 

.29 

12 

12 

15.6 

A-25 


This  is  a  weapon-allocation  problem  where  denotes  the  number  of 
weapons  of  type  i  to  be  allocated  to  the  class  of  targets  of  type  j. 

The  objective  function  is  the  negative  of  a  utility  for  a  given 
allocation.  The  constraint  is  a  volume  constraint  on  storage.  It  would 
also  be  a  cost  constraint.  The  variables  corresponding  to  p^^  being 
zero  are  not  considered  in  the  optimization,  so  there  are  only  32 
variables  in  the  problem. 

Start:  x^j  ■  2.0  for  all  i  and  j, 

f(x°)  -  -29.535, 
r(x°)  -  0.0. 

Result: 


0PC0N 

f(x*) 

-167.7054586 

e(x*) 

1.37E-02 

r(x*) 

1.02E-12 

This  problem  is  difficult  because  it  takes  a  long  time  to  determine 
which  variables  are  nonzero  (eight  are  nonzero  at  the  solution).  Also, 


convergence  is  slow. 


KM 


■  \  * 


SUBROUTINE  FNCVAL(X,F> 

THIS  IS  A  WEAPONS  ALLOCATION  PROBLEM. 

DOUBLE  PRECISION  Xtl).F(l) 

INTEGER  TARGET* 32), TARDEX, I, J 

DOUBLE  PRECISION  PX < 3 2 ) , TEMP , PXCUM  <  8  > , NTAR ( 8 ) 

DATA  NTAR/ 3. 0,40. 0, 55. 0,13.0, IS. 0,70.0/ 

DATA  PK/. 50, .30, .10, .05, .48, .43, .58, .31, .12, .05, .43, .43, .42, .37, 

♦  .20, .07, .88,. 39, .42, .38, .30, .07, .31, .29, .17, .77, .41, .40, 

♦  .59, .75, . 45, .90/ 

DATA  TARGET/ 1,2, 3, 4, 5, 8, 1,2, 3, 4, 5, 8, 1,2, 3. 4, 5, 8, 1,2, 3, 4, 5, 8, 2. 5, 8, 

♦  4, 5, 3, 4, 5/ 

DATA  ZERO, ONE/0. 000,1.000/ 

DO  10  J-1,8 

PKCUM ( J) -ONE 
LO  CONTINUE 

DO  20  1-1,32 

TARD EX-TARGET (I ) 

PXCUM  < TARDEX ) -PXCUM (TARDEX >  * ( ONE-PK < I ) ) * • <  X  <  » /NTAR (TARDEX ) ) 

20  CONTINUE 

TEMP-ZERO 
DO  30  J-1,8 

TEMP  -TEMP  -f  NTAR  ( J)  *  <  PXCUM  <  J )  -ONE  > 

30  CONTINUE 

• 

F ( 1 ) -TEMP 

RETURN 

END 
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[9]  World  Bank  Oil  Price  Model 


References:  None  (this  problem  was  obtained  from  A.  Drud  of  the  World 
Bank) . 


Pi  -  oil  price  in  year  i/10.0, 

t^  ■  total  demand  for  oil  in  year  i/10.0, 

s^  •  supply  of  oil  by  non-OPEC  countries  in  year  i, 

cSj^  ■  cumulative  supply  by  non-OPEC  countries  in 

year  i/10.0, 

d^  ■  demand  for  OPEC  oil  in  year  i/10.0,  and 
r^  ■  OPEC  reserves  of  oil  in  year  i/100.0. 


OPEC  oil  revenue  in  year  i  is  given  by 


10.0  d1(10.0  pA  -  2.5/^), 


OPEC  wants  to  maximize  discounted  oil  revenue  over  10  years. 


1  -j-1 

min  f(x)  -  -  z  10.0  d  (10.0  p  -  2.5/r^  (y^) 


subject  to 


10.0  tdt  -  +  l.Spi  -  1.0  -2.3  (— jjj) 

i  -  1,  ...»  10 

8i  ’  *75  8i-l  “  d*1  +  Pi)"C8i/7,°  -  0,  i  -  1, 

10.0  cs^  -  10.0  cs^_^  -  Sj_  ■  0,  i  ■  1 . 10 

10.0  -  td^  +  s^  ■  0,  1  ■  1,  ...,10 

10.0  —  10.0  +  d^  ■  0,  1  ■  1,  . ..,  10 

p^,  td.£,  s ^ ,  cs^ ,  J  0,  1  *  1,  ...,  1* 


This  problem  has  60  variables  and  50  equality  constraints 


Initial  values  for  year  0:  tdQ  ■  1.8, 

s0  -  6.5, 
csQ  -  0.0, 
rQ  -  5.0. 


Start:  p^ 

■ 

1.4,  1  •  1,  ..., 

10 

tdi 

- 

1.8,  1  *  1,  ..., 

10 

8i 

- 

7.0,  1  *  1,  ..., 

10 

dl 

m 

tdt  -  s^  1  -  1, 

•  •  •  f  10 

CSi 

m 

1 

E  V  1  "  L* 
J-l  2 

•  •  •  f  10 
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f(x  ) 

r(x°) 


-1198.1202 


-  11.565. 


Results: 


World  Bank 


0PC0N 

Optimization  (CONOPT) 

f(x*) 

-818.4235909 

-818.42359 

e(x*) 

5.34E-02 

— 

r(x*) 

3.23E-07 

— 

A-30 


-•Vis' 


u  tl  IJ 


exoutins  fncval  :  zduk  .  fbum) 


VORLD  SANX  ECONOMIC  MODEL  OF  OIL  PRICZS . 

TEN  YEAR  HORIZON . 

SCALED  VERSION. 

DOUBLE  PRECISION  2( 40 > . ? (1 15 . XDUMC I) , FDUMC 1) 

DOUBLE  PRECISION  P < 1 0 ) ,S( 10 > .CSC  1 0 > ,D( 1 0 > .R C 10 ) ,SEG ( 1 3  I 
EQUIVALENCE  :XCI>.?).(XC2I>,3),CX(3l>,CS>.<X(41>.D>, 

*  CXCS1) ,R) . CF(2> .SEC) 

DOUBLE  PRECISION  30 , C( ?) . ZERO .ONE .TEN 
DATA  ZERO. ONE. TEN/ 0. ODO, 1. 0D0.1 .0D1/ 

DATA  SO/ A. EDO/ 

DATA  C/2.SD0.1.0S30,. 75D0 , 1.100,1. DO, 1. 02D0 , 7 . OD-1 / 
TRANSFER  INPUT  .VALUE3  TO  SUBROUTINE  VARIABLES. 

DO  100  1-1.40 

xcd-xdumci; 

130  CONTINUE 


DO  203  1-1.13 

f::)-?<i)-ten*d<I)*cten*pci>-cci)/r(I))*cc2)**(i-i) 

IM1-I-1 

IF  ! I. EC. 13  THEN 

32Q < I) -3 (I ) -C<  3) *  3  0  —  <  C  C 4)*C(3)*P(I))*C(4)**( -CSC  1 ) /C ( 7) ) 
ELSE 

3EQC I)*S(I)-C(3)*S(IM1)-(C(4)+C(3)*P(I)>* 

I  C  ( 4  >  *  *  ( -C3  C I )  /  C  <  7  )  )• 

END  IF 
200  CONTINUE 

TRANSFER  LOCAL  FUNCTION  VALUES  TO  DUMMY  OUTPUT  VARIABLE. 

DO  30B  1*1,11 
FDUMC I)-F(I) 

300  CONTINUE 

RETURN 
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[10]  Madifled  Powell  Singular  Function  Problem 


References:  Buckley  and  Lenir  [1983] 


This  Is  a  modified  version  of  Powell's  singular  funccion  with 
constrainCs  added  to  create  a  large,  constrained  problem. 


1  lb  , 

*  1000  A  tcl(x4j-3  +  10x4j-2^  +  5(c2x41-1 


linear 


+  c3(x4 j-2  ‘  2x4j-l>4  +  10<x4j-3  "  c4x4j>4] 


subject  to  (see  the  listing  that  follows  for  the  code  for 
objective  function  and  10  nonlinear  constraints;  the 
constraints  appear  below) 


-10x^5  -  2x25  ~  -’x32  *  “10 
-2x^  -  5x^i  -  lOxjy  <  -10 
-5x13  ~  10x24  -  2x6Q  <  -15 
-5xjj  ■  2x^3  *  2x^  ^  -20 
-5x2g  “  6x31  -  20x53  -  3x5g  <  -10 

—10.0  ^  x^  ^  10.0,  1  ■  1,  ...,  60. 


‘j. 


SUBROUTINE  rNCVAL <  X,  F) 


W*. 

‘V.v . 
s’V 

Vv 

V 

o -,w 

o, 


c 

c 

c 

c 


A  CONSTRAINED  VERSION  OF  THE  POWELL  SINGULAR  FUNCTION 
MINIMIZATION  PROBLEM. 

DOUBLE  PRECISION  1(1), Ftl) 

INTEGER  J 

DOUBLE  PRECISION  TEMP, TWO, FIVE, TEN. ZERO 
DOUBLE  PRECISION  C<10> 


*  V’ 

j& 

vv 


1 

->> 


■r.J 

-V.’v 

Ml 


r.1- 


I 


8 

& 

m 


Is; 


DATA  ZERO. TWO, FIVE, TEN/ 0. 0,2. 0,5.0, 10.0/ 

OATA  C/1.02, .905,1.08, .943,2.05, .095,1.325, .974, 1. 114 ,1. 009/ 

TEMP-ZERO 

DO  10  J-1,15 

TEMP-TEMP+Ct l)*(X<4*J-3> *TEN*X < 4*J-2))**2  ♦ 

9  FIVE* <C( 2)*Xt4*J-l)-X<4*J) )  **  2  f 

9  C(  3)  *(  Z  ( 4*J-2> -TVO*XC  4*J-1 ) ) **4  ♦ 

9  TEN*tX<4*J-3)-C<4>*X<4*J>>**4 

10  CONTINUE 

Ft  D-TEMP/1.  0D3 

TEMP  .ZERO 

DO  20  J-1,20 

TEMP-TEMP-Ct  3) *(X<  3*J-2)+TVO*X( 3*J-1 ) ) **2  - 
9  CU>*<X<3*J-t)*FIVE*X<3*J>>**2  - 

3  TEN* (C(7)*X( 3* J-2) -X( 3*J) >  **2 

20  CONTINUE 

F(2)-(TEMP+3. 0D2)/ 1.003 

TEMP-ZERO 

DO  30  J-t ,i 

TEMP»TEMP-Ct»)*<X(10*J)+X(10*J-»))**2  - 
9  C<M*-<TVO*XtlO»J-5)-X<lO*J-»)  )**l  - 

9  FIVE*<C(10)*X( 10* J) ♦TVO*X< l0*J-3))**2 

30  CONTINUE 

F(3 )-(TSM?+4. 0D2) / 1 . 0D3 

TEMP -ZERO 

DO  40  J»1 , 3 

TSMP-TEMP-C< 1)*(X(20*J)-X(20*J-19)>**2  - 
9  C ( 2 ) * ( X ( 20  *J) +X ( 20  *J-7) ) **2  - 

9  C ( 3) *< X(20*J-7>-TVQ*X<  20*J-1 9 >  >  ** 2 

40  CONTINUE 

P(4  J-CTEMP+-4 . 5D2)  / 1 . 0D3 


/ 


TEMP -ZERO 
DO  SO  J-l ,9 

TEMP -TEMP -C<  4)*<X<10*J-1> ♦TVO*X( 10*J-I ) >  **2  - 
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*  <C<5)*X< lO*J-l)-FlVE*X<lO*J-4> )**:  - 

%  CU>  *<X<  10*J-«>VTEN*X<  10*J-4)  )**2 

SO  CONTINUE 
C 

Ft  3 )a (TEMP* 7 . 002) / ! .  0D3 
C 

TEMP -ZERO 
00  40  J«l,10 

TSMPaTEMP-C<  7) * (X  <  4* J) -TVO*X ( 4* J-3 ) )  **2  - 
«  FIVE*  <X<  4*J) *C <8)*X(4*J-3) )  *  *2  - 

*  (C(7)*X(4*J-5 )-TVO*X ( 4* J-3 ) )**2 
40  CONTINUE 

C 

E(4 )a(TEMP*l . 003) / 1.003 
C 

TEMP a ZERO 
00  70  J«I , 3 

TEMP»TEMP-C< I0)*(X(20*J-4) ♦TVO*X( 20* J-14  > ) **2  - 

*  TWO* (C ( l)*X(20*J-4)-X(20*J-10))**2  - 

*  <X<20*J-10>-Ct2>*XU0*J-U)>**2 
70  CONTINUE 

C 

F(7)*<TEMP-fl .  003)  / 1 . 003' 

C  ’ 

TEMP - ZERO 
00  00  Jal ,12 

TEMPaTEMP-<C (3) *X <  5*J)-TVO*XC3*J-2>  >  **2  - 

*  TEN*(X(3*J)+C(4)*X(5*J-4) )**2  - 

J  .  <X<3»J-4>*X<3*J-2) >**2 

10  CONTINUE 
C 

F(0 )a<TEMP+l . 003) / 1 . 003 
C 

F(7)a(-(C(S>*X( 10 )-TVO*X ( 1 1 ) ) **2  -  X(10>*X<11)  ♦  2.001)/!. 003 
C 

F<  10)a<-(C(4)*X(30)-TEN*X(40))**2  -  X(50)*X(40)  ♦  S.0DD/1.0D3 
C 

rc  l!)a(-(C(7)*X(21)*X(22)+C(8)*X(23))**2  ♦  7.0DD/1.003 
C 

•  RETURN 
END 
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[11]  Modified  EXTROS  Function  Problem 


Reference:  Buckley  and  Lenlr  [1983] 


This  Is  a  modified  version  of  the  EXTROS  function  described  In  the 
reference.  The  constraint  functions  are  similar  to  those  for  the 
preceding  problem. 


«  J  w  0  0  0 

min  f(x)  *  £^  [100  “  x2i-l^  +  (l  “  x2i-i)  1 


subject  to  (see  the  listing  that  follows  and  the  linear 
constraints  for  the  preceding  problem) . 

Start:  x0^  -  1.0,  i  ■  1,  ...,  60 
f(x°)  -  0.0 
r(x°)  -  1238.9. 

Result: 


f(x*) 

e(x*) 

r(x*) 


0PC0N 


0.1289735371 


9.78E-05 


iHVi’VJ.'Vj.V.HTb,' 


o  n  o  o 


SUBROUTINE  FNCVALIX.F) 


A  CONSTRAINED  VERSION  OF  THE  EXTROS  FUNCTION 
MINIMIZATION  PROBLEM. 

DOUBLE  PRECISION  XU), FU> 

C 

INTEGER  J 

DOUBLE  PRECISION  TEMP , ONE, TWO , FIVE , TEN, HUNDRD , ZERO 
DOUBLE  PRECISION  C<24> 

C 

DATA  ZERO , ONE, TWO, FI VE, TEN, HUNDRD/ 0. DO, 1. DO, 2. DO, 3. DO, 1.01,1. 0D2/ 
DATA  C/1. 02, .903,1. OS, .943, 2. OS, .895. 1.325. .974,1. 114, 1.009, 
t  2.37,8.04  ,-4.237,4.10  9,5.95  99,4.2,-2..  1,8.39,5.39,8.4  21. 

i  5.0432,7.211,4.903,1.327/ 

C 

TEMP. ZERO 
DO  10  J.1,30 

TEMP.TEMP+ HUNDRD*  <X<  2*J)-X( 2*J-1 ) **2) **2+(ONE-X<  2*J-1 ) )  **2 
10  CONTINUE 
C 

F 1 1 )  - TEMP /HUNDRD 
C 

TEMP. ZERO 
DO  20  J.l ,20 

TEMP.TEMP-C( 1 )  * (X  <  3* J-2) +TVO*X ( 3*J- l > )**2  - 

*  C(2)*(X(3*J-1)*FIVE*X(3*J))**2  - 

3  TEN* (C  < 3)*X(3*J-2)-X(3*J>>**2 

20  CONTINUE 
C 

F( 2 ) .TEMP +5 . 0D2 
C 

TEMP -ZERO 
DO  30  J.l ,4 

TEMP. TEMP -C< 4) *(X(1Q*J)>X( l0*J-9) ) **2  - 

*  C< 5 ) *<TVO*X <10*J-5)-X<10*J-9))**2  - 

*  PIVE*(C(4)*X( 10*J) ♦TVO*XC 1 0* J-3 ) )**2 
30  CONTINUE 

C 

F ( 3  5.TEMP  +  4 . 0D2 
C 

TEMP. ZERO 
DO  40  J.l ,3 

TEMP.TEMP-C<  7)*(X(20*J)-X(20*J-19))**2  - 

*  C(8)*(X(20*J)+X(20*J-7))**2  - 

*  C(9)*<X(20*J-7) -TVO*X<  20*J-1 9 ) ) **2 
40  CONTINUE 

C  - 

F(4)«T£MP*4 . SD2 
C 

TEMP. ZERO 
DO  SO  J.l ,4 

TEMP*TEMP-C< 10)*(X(10*J-t  >+TVO*X< 10*J-8))**2  - 

*  <C(ll)*XU0*J-l)-FIVE*X(10*J-4>  >»*2  - 
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*  C(12)*(X(10*U-8) +TEN*X< 10*J-4))**2 
SO  CONTINUE 

W 

r (S )»TEM?»7 . 002 
C 

TEH? •ZERO 
00  40 

TSMP-TEMP-C< 13>*<X(4*J) -TVO*X  <  4*J-3))**2  - 

*  FIVE*<X(4*J)+C(14)*X<4*J-3) )**2  - 

*  (C(13)*X<4*J-5) -TWO*X ( 4* J-3 ) >**2 
40  CONTINUE 

C 

F(4 )«T£M?+1 . 003 
C 

TEMP •ZERO 
DO  70  J>1 . 3 

TEMP-TEMP -CC 14)*<X(20*J-4) +TWO*X( 20*J- 14) >  *  *  2  - 

*  TWO* <C(17)*X<20*J-4)-X<20*J-10))**2  - 

9  <X(20*J-10)-C(18>*X(20*J-14>)**2 

70  CONTINUE 
C 

FC7)-TEMP*1.0D3 

C 

TEMP -ZERO 
00  80  J«i f 12 

TEMP-TEMP-IC < 17 ) *X(  5*<J>-TVO*X(  5*J-2) )  **2  - 

*  TEN*<X<5*J>*C<20)*X(5*J-4> >*»2  - 

9  <X(5*J-4)+X(3*J-2))**2 

80  CONTINUE 
C 

F  ( 8 ) -TEMP> 1 . 0D3 
C 

F(  »)  —  <C<21)*X(10)-TVO*X(in  >**2  -  X(10)*X<11>  ♦  2.001 
C 

F( 10 ) a- (C ( 22 )*X(50) -TEN*X <40))**2  -  X(30)*X(  40)  ♦  5.001 
C 

F(ll)a-(C(23)*X(21)+X<22)+C(24)*X<23))**2  ♦  7.001 
C 

RETURN 

END 


t 
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In  this  appendix,  we  solve  the  problem  of  minimizing  a  strictly 
convex  quadratic  function  over  a  set  of  "minimally  infeasible"  points 


defined  by  a  set  of  inconsistent  linear  inequalities.  Let  the  original 
quadratic  program  be  given  by 

(P)  min  (l/2)x'Qx  +  q'x 
x  s  Rn 

subject  to  Ax  <  b, 

where  Q  is  a  positive  definite  n  x  n  matrix,  q  is  an  n  x  1  vector,  A  is 
an  m  x  n  matrix,  and  b  is  an  m  x  1  vector.  It  can  be  shown  that  the 
following  problem  is  dual  to  (P). 

(D)  min  (l/2)w'Kw  +  k'w 

w  e  Rm 

subject  to  w  >  0, 

where  K  ■  AQ~^A'  and  k  ■  AQ~^q+b.  If  (P)  is  feasible,  then  (P)  has  a 
solution  x*,  (D)  has  a  solution  w* ,  and 


x*  ■  -Q_i[A'w*  +  q] 


(See  Che  discussion  in  section  1.3.1.) 

We  now  consider  Che  case  when  (F)  is  infeasible.  To  analyze  this 
case,  we  use  the  perturbed  problem  in  (x,  s) : 

(P£)  min  (l/2)x'Qx  +  q'x  +  s's 
x  e  Rn 
s  e  Rm 

subject  to  Ax  +  s  <  b, 

with  e  >  0.  Clearly,  (P_)  is  always  feasible.  Note  that  (P_)  is  not 
defined  at  e  ■  0,  so  that  it  is  not  clear  that  as  e  0  the  solution  of 
(P£)  will  be  related  to  the  solution  of  (P) .  Also  note  that  s  is  not  a 
slack  variable  since  it  is  unconstrained  in  sign. 

Given  any  x,  it  is  necessary  to  choose  s  so  that  for  each  i, 
i  *  1 ,  ... ,m: 

(Ax  -  b)i  <  0  *  Si  -  0 

(Ax  -  b)^  >  0  •+•  s^  ■  -(Ax  -  b)^ 


in  order  to  minimize  the  objective  function  of  (?.)  over  s.  Thus  we 


Prop .  B1 :  (P£)  is  equivalent  to  the  unconstrained  problem 

min  (l/2)x'Qx  +  q'x  +  (Ax  -  b)+'(Ax  -  b)+  . 
x 

Let 


Z  ■  {x:  j(Ax  -  tO+d  <  |  CAz  -  b)+|2  for  all  2  2  Rn}, 

i.e.,  Z  is  the  set  of  points  in  Rn  closest  to  feasibility  in  that  the 
residual  vector  is  smallest  in  norm.  Note  that  Z  is  a  convex  set  with 
Z  ■  {x:  Ax  <  b  }  if  the  latter  set  is  nonempty. 


Example  Bl:  A  « 

1  1 

-1  0 

and  b  * 

-1 

0 

0  -1 

0 

Define  the  constraints: 

xi  +  x2  <  -1 
x^  >  0 

X2  >  0 


which  are  clearly  inconsistent.  In  this  case,  Z 


1 

-1 


1 

-1 


{(-1/4,  -1/4)}. 


Example  B2: 


A 


and  b 


Define  the  constraints: 


xi  +  x2  <  -1 
xi  +  x2  >2 

which  are  also  inconsistent.  In  this  case,  Z  *  {x:  +  x2  ’ 

(P£)  is  a  strictly  convex  program  so  that  for  every  e  > 
exists  a  unique  solution  (xe,  se)  with 

(Bl)  se  -  -( Axe  -  b)+. 


Let 


y  »  | (Ax  -  b)+ 1 2  fo r  x  e  Z  (y  >  0), 

T)  -  min  {(l/2)x’Qx  +  q'x}, 
x 


and 


C  *  min  (l/2)x'Qx  +  q'x  (C  >  v)  • 
x  e  Z 


1/2}. 

0  there 


Denoting 


Qg(x,s)  -  (l/2)x'Qx  +  q'x  +  s’s  “  Q(x>  +  2e  s's» 
where  Q(x)  -  (l/2)x'Qx  +  q'x,  we  have 

h  +  2^  r  <  Qg(xe,  se)  <  C  +  27  Y» 

which  implies  Qg(xs,  ss)  ■*-  +  «  as  e  0  if  and  only  if  y  >  0.  Thus 

(B2)  n  <  Qe(xe,  se)  -  y  -  Q(xe)  +  (se'se  -  y)  <  C- 

Now  it  follows  from  (Bl)  that  se'se  -  y  >  0.  Therefore,  since  |xe| 
implies  Q(xe)  >  +  <*»,  (B2)  implies  that  {xe>  must  remain  bounded  and 
hence 

(B3)  (se'se  -  y)  -  0(e). 

We  are  led  to  the  following  result. 

Prop .  B2 :  Let  {xe,  s6))  be  the  family  of  solutions  to  (P£).  Then 

lim  (xe,  se)  -  (x  ,  s  ) 
e  ♦  0+ 

A 

where  x  is  the  unique  solution  to  the  problem 


min  (l/2)x*Qx  +  q'x 
x 

subject  to  x  e  Z. 

Pf :  It  follows  from  the  earlier  comments  that  {x6}  is  bounded. 

A  C 

Let  x  be  any  limit  point  of  {xe},  i.e.,  there  is  a  sequence  {e^}  -*■  0 
such  that 


Eg  A 

lim  x  A  ■  x 
l  ® 


It  follows  from  (Bl)  that 

s  »  lim  s  *  ■  -lim  (Ax*’-*’  -  b)+  -  -(Ax  -  b)+ 
l  -*•  •  A  ♦  « 

A  A  A 

exists  and  from  (B3)  that  s  's  ■  y.  Thus,  x  e  Z.  Now  suppose,  for 
contradiction,  there  is  an  x°  e  Z  such  that 

Q(x)  -  (l/2)x'Qx  +  q'x  >  (l/2)x°'Qx°  +  q'x°  -  Q(x°) . 


Let  s°  »  -( Ax°  -  b). .  Then,  for  any  e« 


> 


Qg  (x°,  3°)  -  Q  (x6^,  3 e*-)  - 
eJ l  ea 

Q(x°)  -  Q(x6a)  +  (Y  -  |  sCjl||)  <  0 

£n  e, 

which  contradicts  the  definition  of  (x  s  A) .  j 

A  A 

If  (P)  is  feasible,  (x,  s)  ■  (x*,  u*),  where  u*  is  the  multiplier  of  the 
dual  problem  to  (P)  and  x*  is  the  solution  to  (P).  Proposition  B2  is  a 
special  case  of  the  penalty  function  theory  for  nonlinear  programming. 

Let  (xe,  se)  be  the  solution  to  (P£)  and  let 

Je  “  {j:  (Axe  -  b)j  >  0>, 

i.e.,  Jg  is  the  index  set  of  violated  constraints.  Using  the 
unconstrained  form  of  (Pg)  (see  Problem  Bl)  we  have  that  xe  is  a 
solution  of 

min  (l/2)x'Qx  +  q'x  +  |(Ax  -  b)+|| 

► 

if  and  only  if 

6 

Qxs  +  q  +  7  {  E  A,  (Ax6  -  b).}  -  0. 
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fey 


fc; 


ft 


i 


* 


V/ 

■v 

".v. 

Si 

& 

& 

S; 

£:> 


-■ 


;V 

.M.v 


Kv 
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To  construct  a  dual  for  (Pg),  we  consider  the  Lagrangian  function 


Le(x,  s,  w)  -  (l/2)x'Qx  +  q'x  +  s's  +  w'(Ax  +  s  -  b) 


2e 


(Pg)  is  equivalent  to 


inf  sup  Lg(x,  s,  w) 


(x,s)  w  >  0 


inf 

(x,s) 


Qe(x,  s) :  Ax  +  s  <  b 


-H»  :  otherwise 


and  we  define  the  dual  problem  to  be 


sup  inf  L£(x,  s,  w) . 
w  >  0  (x,s) 


For  a  given  w  >  0,  we  have  that  (xw,  sw)  minimizes  Lg(x,  s,  w)  if  and 
only  if 


is  equivalent  to  solving  (Pg).  Finally,  we  obtain  the  following 
theorem. 

Theorem  (Theorem  3  of  Section  1.3.1):  Let  {we>  be  the  family  of 
solutions  to  (Dg)  for  positive  values  of  e  and  for  each  e  let  xe  be 
given  by 

xe  ■  -Q'^tA’w6  +  3]  • 

Then 

lim  xe  -  x’ 
e  ■*  0+ 

where  x  is  a  solution  of  the  problem 

min  (l/2)x'Qx  +  q'x. 
x  e  Z 

Pf :  Propositions  Bl,  B2,  and  B3  imply  the  theorem  for  problems 
having  inequality  constraints.  Extension  of  the  proof  to  Include 
equality  constraints  is  straightforward. 
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